Closed mengfanShi closed 4 years ago
Hi, I'm glad you are interested in this repo. Many things can cause this error. For example: https://discuss.pytorch.org/t/resolved-batchnorm1d-cudnn-status-not-supported/3049 I use CUDA 10.1 and cuDNN 7.4, and PyTorch 1.0, 1.1, 1.2, 1.3 and 1.4 all work fine. Please first make sure the package versions are compatible before trying anything else.
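For anyone comparing environments, a minimal sketch of how the versions could be collected in one place. The helper names `report_versions` and `report_cuda_stack` are hypothetical and not part of this repo; the sketch assumes Python 3.8+ and returns None for anything that is not installed.

```python
from importlib import metadata


def report_versions(packages=("torch", "h5py", "numpy")):
    """Return {package: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def report_cuda_stack():
    """Return the CUDA/cuDNN versions PyTorch reports, or None without torch."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "cuda": torch.version.cuda,                # CUDA version torch was built with
        "cudnn": torch.backends.cudnn.version(),   # integer such as 7603, or None
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(report_versions())
    print(report_cuda_stack())
```

Pasting the printed output into the issue makes it easier to spot a mismatch between the CUDA/cuDNN build and the installed PyTorch.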
Hi :) I have received your email. I'm using HDF5 1.10, which supports SWMR mode (better multiprocessing handling). https://github.com/jialee93/Improved-Body-Parts/blob/316e71fa93e1dc444b1cfd4fc312c21c13bfe93f/py_cocodata_server/py_data_iterator.py#L42
There is a good discussion here, and I summarized it here.
Thanks for your response :) I find that the h5py installed via pip (version 2.10.0) also supports SWMR mode. Is it still necessary to install HDF5 and rebuild h5py? I also tested train_parallel.py, and the same error occurs T_T. BTW, I use CUDA 10.1, cuDNN 10.0, and PyTorch 1.4.0.
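One way to check whether a given h5py build can use SWMR is to look at the HDF5 version it was compiled against. This is only a sketch: the helper name `swmr_supported` is hypothetical, and the 1.10.0 threshold is taken from the HDF5 1.10 release mentioned above.

```python
def swmr_supported(min_version=(1, 10, 0)):
    """Return True if h5py's bundled HDF5 is at least min_version.

    Returns None when h5py is not installed.
    """
    try:
        import h5py
    except ImportError:
        return None
    # hdf5_version_tuple is the HDF5 library version h5py was built against
    return h5py.version.hdf5_version_tuple >= min_version


if __name__ == "__main__":
    print("SWMR-capable HDF5:", swmr_supported())
```

If this prints True for the pip-installed wheel, rebuilding h5py against a separate HDF5 install is probably not what decides whether SWMR works.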
I have rebuilt h5py against HDF5, but the error still occurs. It's hard to locate the problem T_T.
Sorry for the trouble 😓. I have never run into this error before, and I haven't used cuDNN 10.0. What if you set torch.backends.cudnn.enabled = False?
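As a sketch of where that flag would go: it has to be set before the model runs its first forward pass, for example at the top of the training script. The wrapper `set_cudnn_enabled` is a hypothetical helper, and the import guard is an assumption so that the snippet also runs where PyTorch is missing.

```python
try:
    import torch
except ImportError:  # assumption: degrade gracefully when torch is absent
    torch = None


def set_cudnn_enabled(enabled):
    """Toggle cuDNN globally; return the resulting flag, or None without torch."""
    if torch is None:
        return None
    torch.backends.cudnn.enabled = enabled
    return torch.backends.cudnn.enabled


# Disable cuDNN so PyTorch falls back to its native CUDA kernels
set_cudnn_enabled(False)
```

Training is usually slower with cuDNN disabled, so this is mainly useful for finding out whether cuDNN is involved in the failure.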
I tried that before, but it didn't help.
:( Then I have no idea for now. Feel free to follow up if you find more information.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
So glad to see your project. I successfully ran the demo and created the h5 file. But when I try to train the model, an error appears: RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input. I really hope to get your help, thank you very much.
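The error message itself hints at non-contiguous inputs. Whether that is the cause here is not confirmed, but a minimal sketch of how to check a tensor and make it contiguous looks like this (the tensor `x` is made-up example data, not something from this repo):

```python
try:
    import torch
except ImportError:  # assumption: degrade gracefully when torch is absent
    torch = None

if torch is not None:
    # permute returns a view with reordered strides, which is non-contiguous
    x = torch.randn(2, 3, 4).permute(0, 2, 1)
    print("contiguous before:", x.is_contiguous())

    # contiguous() copies the data into a fresh, densely packed layout
    y = x.contiguous()
    print("contiguous after:", y.is_contiguous())
```

Calling .contiguous() on the batch right before it is passed to the model is a cheap way to rule this cause out.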