frankkramer-lab / MIScnn

A framework for Medical Image Segmentation with Convolutional Neural Networks and Deep Learning
GNU General Public License v3.0

MIScnn No Longer Running in Colab #125

Closed ChrisJWest closed 2 years ago

ChrisJWest commented 2 years ago

Hi! I've previously had no issues running my MIScnn models in Google Colab. Recently, however, I started getting this error:

Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[node model/conv3d/Conv3D
 (defined at /usr/local/lib/python3.7/dist-packages/keras/layers/convolutional.py:238)
]] [Op:__inference_train_function_5792]

Errors may have originated from an input operation.
Input Source operations connected to node model/conv3d/Conv3D:
In[0] IteratorGetNext (defined at /usr/local/lib/python3.7/dist-packages/keras/engine/training.py:866)  
In[1] model/conv3d/Conv3D/ReadVariableOp: 
(... followed by many locations that didn't seem to have much relevance)

I noticed that MIScnn was upgraded to 1.4.0, so I tried a factory reset and downgraded back to 1.3.0, but unfortunately I get the same error. This could be entirely a problem on my end, but would you have any ideas? The CUDA and package versions seem to be the same when I do this, so I'm really not sure what could be going on. Thanks!
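When comparing a working and a failing Colab session, it can help to print the installed package versions in both and diff the output. The following is only a minimal sketch: the helper name `installed_versions` and the package list are assumptions chosen for illustration, not part of MIScnn.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:  # Python 3.7 needs the importlib_metadata backport
    import importlib_metadata as metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Hypothetical package list; adjust it to whatever the notebook installs.
for name, version in installed_versions(["miscnn", "tensorflow", "numpy"]).items():
    print(f"{name}: {version}")
```

This only reports Python packages. System libraries such as cuDNN are not covered by it.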

ChrisJWest commented 2 years ago

EDIT: I somewhat fixed this bug with a sketchy workaround. It turns out that the issue lay in a discrepancy between cuDNN versions. I'm not sure why this happened, even when I explicitly tried to roll back my versions. Here is the error:

2022-02-28 22:28:50.869979: E tensorflow/stream_executor/cuda/cuda_dnn.cc:359] Loaded runtime CuDNN library: 8.0.5 but source was compiled with: 8.1.0. CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.

I don't think this had anything to do with MIScnn, but rather with TensorFlow and/or Google Colab. What I ended up doing was rolling back my TensorFlow version to 2.4.0. This is quite bad, as my MIScnn version (1.3.0) explicitly asks for 2.7.0 or higher, but it seems to work so far, so I am happy. Will update / reopen if anything comes up. Thanks!
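For anyone hitting the same log line, the rule it states (matching major version, and a runtime minor version equal to or higher than the compiled one) can be written as a small check. This is a sketch of that rule only, not TensorFlow's actual implementation. The function names are made up for illustration, and the version strings are typed in by hand from the log above.

```python
def parse_major_minor(text):
    """Turn a version string such as '8.0.5' into (major, minor)."""
    parts = tuple(int(part) for part in text.strip().split("."))
    # Pad with zeros so that a bare '8' is treated as 8.0.
    return (parts + (0, 0))[:2]


def cudnn_runtime_ok(loaded, compiled):
    """Apply the rule from the log: same major, loaded minor >= compiled minor."""
    loaded_major, loaded_minor = parse_major_minor(loaded)
    compiled_major, compiled_minor = parse_major_minor(compiled)
    return loaded_major == compiled_major and loaded_minor >= compiled_minor


# Values copied from the error message above.
print(cudnn_runtime_ok("8.0.5", "8.1.0"))  # False: runtime minor 0 < compiled minor 1
```

Under this rule, a TensorFlow build compiled against an older cuDNN minor version would accept the 8.0.5 runtime, which is consistent with the rollback making the error go away.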

muellerdo commented 2 years ago

Hello @ChrisJWest,

good to hear that you found a workaround for this issue.

The main problem, as far as I can tell, is that Google Colab uses Python 3.6/3.7. However, since 01.01.22, all major packages like NumPy have switched to requiring Python 3.8, which is why dependency issues with Google Colab installations have sadly been quite common over the last month.
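A notebook can test for this up front before installing anything. The sketch below assumes a minimum of Python 3.8, taken from the comment above rather than checked against each package's metadata, and the helper name `python_is_at_least` is hypothetical.

```python
import sys


def python_is_at_least(minimum, current=None):
    """Return True if the (major, minor) interpreter version meets the minimum."""
    if current is None:
        current = sys.version_info[:2]
    return tuple(current) >= tuple(minimum)


# Assumed minimum of (3, 8), based on the comment above.
if not python_is_at_least((3, 8)):
    print("Python is older than 3.8; recent package releases may refuse to install.")
```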

> Will update / reopen if anything comes up.

Please! In a month, our lab is hosting a practical course in which students will work with MIScnn in a Google Colab environment. So I will probably see all the latest dependency issues and can try to fix them or give better advice.

Cheers, Dominik