oandrienko / fast-semantic-segmentation

ICNet and PSPNet-50 in Tensorflow for real-time semantic segmentation

FAILED TO GET CONVOLUTION ALGORITHM. RTX 2060, CUDA 10, cuDNN 7.4.2.24, tensorflow gpu (tb-nightly-gpu) #11

Closed: deep28vish closed 5 years ago

deep28vish commented 5 years ago

My env: Windows 10, RTX 2060, Python 3.6.6, TensorFlow tb-nightly-gpu, cuDNN 7.4.2.24 for CUDA 10, CUDA 10.0

Dataset: CIFAR-10

Any suggestions?

ERROR:

runfile('D:/PY_TF/MNIST_AE_CNN/OPEN_CV_ICUCNN-20190119T181516Z-001/OPEN_CV_ICUCNN/OPEN_CV_CNN001.py', wdir='D:/PY_TF/MNIST_AE_CNN/OPEN_CV_ICUCNN-20190119T181516Z-001/OPEN_CV_ICUCNN')
Using TensorFlow backend.
WARNING: Logging before flag parsing goes to stderr.
W0128 19:48:13.545505  5732 deprecation.py:506] From C:\Anaconda3\envs\test1_import\lib\site-packages\keras\backend\tensorflow_backend.py:3445: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
Epoch 1/1
Traceback (most recent call last):

  File "<ipython-input-1-556c55792f64>", line 1, in <module>
    runfile('D:/PY_TF/MNIST_AE_CNN/OPEN_CV_ICUCNN-20190119T181516Z-001/OPEN_CV_ICUCNN/OPEN_CV_CNN001.py', wdir='D:/PY_TF/MNIST_AE_CNN/OPEN_CV_ICUCNN-20190119T181516Z-001/OPEN_CV_ICUCNN')

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 704, in runfile
    execfile(filename, namespace)

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 108, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "D:/PY_TF/MNIST_AE_CNN/OPEN_CV_ICUCNN-20190119T181516Z-001/OPEN_CV_ICUCNN/OPEN_CV_CNN001.py", line 64, in <module>
    history = model1.fit(train_imgs, train_ans_one_hot, batch_size = batch_size, epochs= epochs, verbose =1)

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\keras\engine\training.py", line 1039, in fit
    validation_steps=validation_steps)

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\keras\engine\training_arrays.py", line 199, in fit_loop
    outs = f(ins_batch)

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\keras\backend\tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\keras\backend\tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\tensorflow\python\client\session.py", line 1440, in __call__
    run_metadata_ptr)

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 544, in __exit__
    c_api.TF_GetCode(self.status.status))

UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node conv2d_1/convolution}}]]
     [[metrics/acc/Mean/_113]]

oandrienko commented 5 years ago

Hey, thanks for your interest in the project, and sorry for the late reply. Hoping you have solved this.

This is due to a compatibility issue between CUDA/cuDNN and the TensorFlow version you are using. You will need to check which versions of cuDNN and CUDA your TensorFlow build was compiled against and install matching ones. I think the nightly builds have supported CUDA 10 for a while, but I haven't looked into this. I have stuck to older TF versions such as v1.10 with CUDA 9, as that is what was officially supported. Hope this bit of extra information helps.
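A minimal sketch of how one might check what the installed TensorFlow reports about its own build. It is only a sketch, not something from this repository. The helper name `describe_tf_build` is hypothetical. `tf.test.is_built_with_cuda()` is available in both TF 1.x and 2.x. `tf.sysconfig.get_build_info()` is assumed to exist only in newer releases (TF 2.3 and later), so the script looks it up defensively and skips the CUDA/cuDNN version fields when it is missing, as it would be on the nightly build from the original report.

```python
import importlib


def describe_tf_build():
    """Collect what the installed TensorFlow reports about its GPU build.

    Hypothetical helper for illustration; returns a plain dict.
    """
    info = {"tensorflow_installed": False}
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        # TensorFlow is missing or failed to load its native libraries.
        return info

    info["tensorflow_installed"] = True
    info["version"] = tf.__version__
    info["built_with_cuda"] = tf.test.is_built_with_cuda()

    # Assumption: get_build_info() is only present in newer releases,
    # so do not rely on it being there.
    get_build_info = getattr(tf.sysconfig, "get_build_info", None)
    if get_build_info is not None:
        build = get_build_info()
        # CPU-only builds may not carry these keys, hence .get().
        info["cuda_version"] = build.get("cuda_version")
        info["cudnn_version"] = build.get("cudnn_version")
    return info


if __name__ == "__main__":
    for key, value in describe_tf_build().items():
        print(f"{key}: {value}")
```

If the reported CUDA or cuDNN version differs from what is installed on the machine, that mismatch is a likely candidate for the cuDNN initialization failure. On older builds that do not expose the version fields, the tested build configurations listed in the TensorFlow install docs are the place to look.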