Open minwang-ai opened 3 years ago
We are seeing the same errors/warnings. They indicate that the installed TensorFlow version is not compatible with the CUDA / cuDNN version installed by PyTorch. However, we do not use any of TensorFlow's CUDA / cuDNN functionality; TensorFlow is only used to speed up the confusion matrix calculation. Therefore, you can simply ignore these warnings/errors.
If you want to get rid of them anyway, you can switch to the CPU version of TensorFlow by replacing
tensorflow-gpu=1.15.0
with
tensorflow-cpu=1.15.0
in the environment file, or by using pip if the environment has already been created. The CPU version works well.
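For readers unfamiliar with the metric mentioned above, here is a minimal pure-Python sketch of a confusion matrix and the mIoU derived from it. It is not ESANet's implementation (which, per the comment above, uses TensorFlow for speed); the function names `confusion_matrix` and `mean_iou` are hypothetical, and classes that never occur are simply skipped when averaging, which is an assumption of this sketch.

```python
def confusion_matrix(labels, predictions, num_classes):
    """Count how often true class t (row) was predicted as class p (column)."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(labels, predictions):
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average the per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted as c, but another class
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        denom = tp + fp + fn
        if denom > 0:  # assumption: skip classes absent from labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    cm = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
    print(cm, mean_iou(cm))
```

Nothing in this computation needs a GPU, which is consistent with the CPU build of TensorFlow being sufficient for this step.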
Hi Daniel, thank you for your reply! I previously thought the code could not run with these errors, but I now see that the output file is being updated. The tf errors disappear after loading cuda 10.0.130 and cudnn 10.1_v7.5
We can also ignore the other warning, e.g., UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule.
Yes, since we always pass the epoch explicitly as a parameter (https://github.com/TUI-NICR/ESANet/blob/main/train.py#L264), the warning can be ignored here.
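To illustrate why passing the epoch explicitly makes the call order harmless, here is a toy sketch in plain Python. `ToyScheduler` is a hypothetical stand-in with an exponential decay, not PyTorch's scheduler API and not the scheduler used in train.py; it only models the "skipped first value" behaviour that the warning describes.

```python
class ToyScheduler:
    """Hypothetical stand-in for a learning rate scheduler (not PyTorch's API)."""

    def __init__(self, base_lr, gamma):
        self.base_lr = base_lr
        self.gamma = gamma
        self.last_epoch = 0

    def lr(self):
        # The learning rate is a pure function of the stored epoch index.
        return self.base_lr * self.gamma ** self.last_epoch

    def step(self, epoch=None):
        # Without an argument the internal counter advances by one;
        # with an explicit epoch the counter is simply overwritten.
        self.last_epoch = self.last_epoch + 1 if epoch is None else epoch


# Scheduler stepped first, implicit counter: the first value (1.0) is never used.
implicit = ToyScheduler(base_lr=1.0, gamma=0.5)
implicit_lrs = []
for _ in range(3):
    implicit.step()
    implicit_lrs.append(implicit.lr())

# Scheduler stepped first, explicit epoch: every epoch gets its intended value.
explicit = ToyScheduler(base_lr=1.0, gamma=0.5)
explicit_lrs = []
for epoch in range(3):
    explicit.step(epoch)
    explicit_lrs.append(explicit.lr())

print(implicit_lrs)
print(explicit_lrs)
```

In this toy model the implicit variant starts one step too far along the schedule, while the explicit variant does not depend on when `step` is called relative to the optimizer update.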
The tf errors disappear after loading cuda 10.0.130 and cudnn 10.1_v7.5
The mIoU decreases with these versions, so I changed CUDA and cuDNN back to 10.1 and 7.6, respectively.
I replaced tensorflow-gpu=1.15.0 with tensorflow-cpu=1.15.0, but the same error occurs when the model is in the test phase: Could not dlopen library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory...
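One way to check independently of TensorFlow whether the dynamic loader can find such a library is to try loading it with the standard `ctypes` module. This is a minimal sketch; the helper name `find_missing_libs` is hypothetical, and the only library name listed is the one from the error message above, so the list would need to be extended for other libraries.

```python
import ctypes


def find_missing_libs(names):
    """Return the shared library names that the dynamic loader cannot open."""
    missing = []
    for name in names:
        try:
            ctypes.CDLL(name)  # raises OSError if the library cannot be loaded
        except OSError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Library name taken from the error message; add others as needed.
    print(find_missing_libs(["libcufft.so.10.0"]))
```

If the name is reported as missing here as well, the problem lies in the loader search path or the installed CUDA version rather than in the training code.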
Hi all, I followed the README for installation, but I get library errors when I train models via train.py. I have checked the TensorFlow GPU, CUDA, and cuDNN versions. Could you help me figure it out?
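When checking versions, it can help to confirm which TensorFlow distributions are actually installed in the active environment, since a leftover GPU build next to a CPU build would be easy to miss. The following is a small sketch using the standard `importlib.metadata` module (Python 3.8+); the helper name `installed_versions` is hypothetical, and whether packages installed through conda expose this metadata is an assumption.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print(installed_versions(["tensorflow", "tensorflow-gpu", "tensorflow-cpu"]))
```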