jiazhihao / TASO

The Tensor Algebra SuperOptimizer for Deep Learning
Apache License 2.0

I get "CUDNN failure: CUDNN_STATUS_INTERNAL_ERROR" when I create new graph #102

Open AnouarITI opened 11 months ago

AnouarITI commented 11 months ago

Hi, I am new to TASO. I am trying to take my first steps, so I installed everything according to the instructions. Then I ran this example code:

import taso
import onnx

graph = taso.new_graph()

When I try to create a new graph with TASO, I get this cuDNN error:

CUDNN failure: CUDNN_STATUS_INTERNAL_ERROR
/workspace/taso/src/cudnn/ops_cudnn.cu:25
Aborting...

Is this related to the CUDA version installed? This is what I get when I check the nvcc version.

86da0fd15f0a:/workspace# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0

I am using the nvcr.io/nvidia/tensorflow:20.08-tf2-py3 container image from NVIDIA NGC.

AnouarITI commented 11 months ago

Hi, I figured out that this is a memory issue (which is odd, since I am working with a Tesla V100 SXM3). I checked an earlier issue where you suggested modifying lines 53 to 56 in TASO/include/taso/ops.h as follows:

#define MAX_DIM 4
#define MAX_NUM_SPLITS 16
#define MAX_NUM_INPUTS 6
#define MAX_NUM_OUTPUTS 6

But I still get the same error. Any other suggestions?
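
Since the failure looks memory-related, one way to narrow it down is to check how much GPU memory is actually free right before calling taso.new_graph(). The following is only a minimal sketch, not part of TASO: the helper names parse_gpu_memory and query_gpu_memory are hypothetical, and it assumes nvidia-smi is on the PATH inside the container. If most of the memory is already in use, another process holding the GPU would be a plausible cause (an assumption, not something confirmed in this thread).

```python
import shutil
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'total, used, free' rows (values in MiB) from nvidia-smi CSV output."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        total, used, free = (int(field.strip()) for field in line.split(","))
        gpus.append({"total_mib": total, "used_mib": used, "free_mib": free})
    return gpus


def query_gpu_memory():
    """Return per-GPU memory stats, or None if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            [
                "nvidia-smi",
                "--query-gpu=memory.total,memory.used,memory.free",
                "--format=csv,noheader,nounits",
            ],
            capture_output=True,
            text=True,
            check=True,
        )
        return parse_gpu_memory(result.stdout)
    except (subprocess.CalledProcessError, OSError, ValueError):
        # Covers a failing nvidia-smi call and fields that are not integers.
        return None


if __name__ == "__main__":
    stats = query_gpu_memory()
    if stats is None:
        print("nvidia-smi not available; cannot inspect GPU memory")
    else:
        for index, gpu in enumerate(stats):
            print(f"GPU {index}: {gpu['free_mib']} MiB free of {gpu['total_mib']} MiB")
```

Running this in the same container just before the failing script shows whether the GPU is nearly full before TASO initializes cuDNN.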