longyunf / radiant


How to use multi-GPU training? #14

Closed stanny880913 closed 5 months ago

stanny880913 commented 5 months ago

I run training with CUDA_VISIBLE_DEVICES=0,1,2,3 python scripts/train_radiant_pgd.py --num_gpus 4 --samples_per_gpu 2 --epochs 10 --lr 0.001 --workers_per_gpu 2, but when I check nvidia-smi, only one GPU is at 80-95% utilization while the others are at 0-2%. How can I train on all 4 GPUs at the same time?

Thank you.

longyunf commented 5 months ago

Check if the list available_gpu_ids includes IDs of all 4 GPUs: https://github.com/longyunf/radiant/blob/cf5355396d42ef17940e29ef8f9e3cabfd8035c3/scripts/train_radiant_pgd.py#L319

If true, the model is supposed to use 4 GPUs via the following line: https://github.com/longyunf/radiant/blob/cf5355396d42ef17940e29ef8f9e3cabfd8035c3/scripts/train_radiant_pgd.py#L334
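A quick way to run that check outside the training script is sketched below. It is only a sketch under assumptions: it assumes the list is built from torch.cuda.device_count(), which may not match what train_radiant_pgd.py actually does, and the helper names parse_visible_devices, check_gpu_ids and query_torch_gpu_ids are made up for illustration. Note that with CUDA_VISIBLE_DEVICES=0,1,2,3 the process sees the GPUs as local indices 0 to 3.

```python
import os


def parse_visible_devices(value):
    """Split a CUDA_VISIBLE_DEVICES string into its entries.

    Returns None when the variable is unset (CUDA then exposes every GPU).
    """
    if value is None:
        return None
    return [tok.strip() for tok in value.split(",") if tok.strip()]


def check_gpu_ids(available_gpu_ids, num_gpus):
    """True if the list holds exactly the local indices 0..num_gpus-1."""
    return sorted(available_gpu_ids) == list(range(num_gpus))


def query_torch_gpu_ids():
    """Local GPU indices visible to PyTorch, or None if PyTorch is missing.

    Assumption: the training script derives available_gpu_ids in a similar way.
    """
    try:
        import torch
    except ImportError:
        return None
    return list(range(torch.cuda.device_count()))


if __name__ == "__main__":
    print("CUDA_VISIBLE_DEVICES:",
          parse_visible_devices(os.environ.get("CUDA_VISIBLE_DEVICES")))
    gpu_ids = query_torch_gpu_ids()
    if gpu_ids is None:
        print("PyTorch is not installed; cannot query GPUs")
    else:
        print("GPU ids visible to PyTorch:", gpu_ids)
        print("All 4 GPUs available:", check_gpu_ids(gpu_ids, 4))
```

If this reports fewer than 4 IDs when launched with the same CUDA_VISIBLE_DEVICES prefix as the training command, the problem is likely in the environment (driver, container, or shell) rather than in the script.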

stanny880913 commented 5 months ago

Thank you.