ligua opened 5 years ago
I have also attempted the single-GPU script, which failed to run on one GeForce GTX TITAN X. Is the script designed for a 24GB GPU?
Thanks and Best Regards
I'm also facing this issue with 8x NVIDIA V100s in a cloud environment. The first GPU is more than 80% utilized, but the rest are highly under-utilized.
I am guessing there is a synchronization issue in the code.
Yeah maybe
Hi, I am using the script provided for multi-GPU training. However, there seems to be a significant memory overhead on the first GPU. May I ask whether this is normal?
The following is the GPU memory usage (I am training on 4x 12GB GeForce GTX TITAN X):

memory.used [MiB], memory.free [MiB]
11385 MiB, 822 MiB
2976 MiB, 9231 MiB
2976 MiB, 9231 MiB
2976 MiB, 9231 MiB
The following is the script I use:

CUDA_VISIBLE_DEVICES=$GPUS python train.py --name label2city_512 --label_nc 35 --loadSize 256 --use_instance --fg --gpu_ids 0,1,2,3 --n_gpus_gen 3 --n_frames_total 6 --max_frames_per_gpu 1 --debug
May I ask why there is such a significant overhead on the first GPU? Is it caused by DataParallel?
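For what it's worth, this imbalance is a known behavior of `torch.nn.DataParallel`: the replicas run the forward pass on each GPU, but the outputs are gathered back onto `output_device` (by default `device_ids[0]`), and the loss and backward graph for the whole batch are typically built there, so GPU 0 holds extra activations on top of its own shard. A minimal sketch of the API (the model and sizes here are made up for illustration, not from vid2vid):

```python
import torch
import torch.nn as nn

# Toy model standing in for the generator; sizes are arbitrary.
model = nn.Linear(8, 4)

if torch.cuda.is_available() and torch.cuda.device_count() > 1:
    # Replicate across all visible GPUs. Outputs of every replica are
    # gathered onto output_device (defaults to device_ids[0]), which is
    # one reason the first GPU shows much higher memory usage.
    device_ids = list(range(torch.cuda.device_count()))
    model = nn.DataParallel(model, device_ids=device_ids, output_device=0).cuda()
    x = torch.randn(16, 8).cuda()  # full batch starts on GPU 0, then is scattered
else:
    # CPU fallback so the sketch runs anywhere.
    x = torch.randn(16, 8)

out = model(x)
print(tuple(out.shape))  # (16, 4): gathered result for the whole batch
```

Moving the loss computation inside the wrapped module (so it is computed per-replica and only scalars are gathered) is a common way to reduce the GPU-0 overhead, though it does not eliminate it.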
Also, I always get a CUDA out-of-memory error when I set loadSize to 512. May I ask why? Was the script originally designed for 24GB GPUs?
Thank you so much. Best regards