genforce / idinvert

[ECCV 2020] In-Domain GAN Inversion for Real Image Editing
https://genforce.github.io/idinvert/
MIT License

CUDA, cuDNN and NCCL versions #46

Status: Closed (basitanees closed this issue 3 years ago)

basitanees commented 3 years ago

Hi,

I am using 2 Tesla V100s. My program gets killed when using more than 1 GPU. [Screenshot 2021-05-07 082333]

When using 8 Tesla T4s, it gets killed as well (along with some memory errors, which went away after allowing memory growth and limiting the memory fraction). [Screenshot 2021-05-06 174203]
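For readers who want to try the same workaround, here is a minimal sketch of what "allowing growth and limiting memory fraction" usually looks like in TensorFlow 1.x. The helper name `make_session_config` and the 0.8 fraction are illustrative assumptions, not taken from the idinvert code; the TensorFlow module is passed in as an argument so the same helper can be given either `tensorflow` (1.x) or `tensorflow.compat.v1`.

```python
def make_session_config(tf_module, memory_fraction=0.8):
    """Build a TF1-style session config that limits GPU memory use.

    tf_module: `tensorflow` (1.x) or `tensorflow.compat.v1`.
    memory_fraction: hypothetical cap on per-process GPU memory, in (0, 1].
    """
    if not 0.0 < memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be in (0, 1]")
    config = tf_module.ConfigProto()
    # Allocate GPU memory on demand instead of reserving it all up front.
    config.gpu_options.allow_growth = True
    # Cap the share of each visible GPU's memory this process may use.
    config.gpu_options.per_process_gpu_memory_fraction = memory_fraction
    return config


# Usage (assumes TensorFlow 1.x is installed):
#   import tensorflow as tf
#   sess = tf.Session(config=make_session_config(tf, memory_fraction=0.8))
```

These options only affect GPU memory, so they would not be expected to help if the process is running out of host (CPU) RAM.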

My guess is that it could be due to incompatibility issues. Which versions of CUDA, cuDNN and NCCL are you using for your implementation? I am currently using:

Any help would be appreciated.

Regards,

zhujiapeng commented 3 years ago

You could try CUDA 10.0, cuDNN 7.6.4, and tensorflow-gpu==1.12.0.
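To compare a local setup against these versions, something like the following sketch could help. It reads the CUDA toolkit release from `nvcc --version`; the function names are illustrative, and the release that `nvcc` reports may differ from the CUDA runtime that TensorFlow actually loads, so treat the result as a hint only. The TensorFlow version itself can be checked separately with `tf.__version__`.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Return the CUDA release (e.g. '10.0') found in `nvcc --version` text, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run nvcc if it is on PATH and return its release string, else None."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("CUDA toolkit release:", installed_cuda_release())
```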

basitanees commented 3 years ago

Thank you for your feedback. The problem was actually related to CPU RAM: increasing the RAM solved the issue, together with the CUDA and TensorFlow versions you mentioned.
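A process that is "Killed" without a Python traceback is often a sign that the Linux out-of-memory killer stepped in, which would be consistent with the host RAM explanation above. As a rough sketch, assuming a Linux machine whose `/proc/meminfo` exposes a `MemAvailable` field, available host memory can be checked before launching a multi-GPU run. The function names are illustrative.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict mapping field name to its integer value."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only well-formed "Name:   12345 kB" style entries.
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_host_ram_gib(path="/proc/meminfo"):
    """Return the kernel's estimate of available host RAM in GiB (Linux only)."""
    with open(path) as handle:
        info = parse_meminfo(handle.read())
    # MemAvailable is reported in kB; convert kB -> GiB.
    return info["MemAvailable"] / (1024 ** 2)


if __name__ == "__main__":
    print("Available host RAM: %.1f GiB" % available_host_ram_gib())
```

Host memory use tends to grow with the number of GPUs, since each one typically needs its own input pipeline buffers, so a run that fits on one GPU may not fit on two with the same amount of RAM.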