Open adkAurora opened 1 year ago

I am training NeuS using two GPUs. Do I need to change any config parameters? Should I reduce the trainer max_steps by half?

Hi! In DDP, model.train_num_rays and model.max_train_num_rays are defined per device, so you could either halve these values or simply train for fewer iterations :)
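
A minimal sketch of the arithmetic behind this (the helper and the numeric values below are illustrative, not taken from the repo's configs):

```python
# With DDP, every process casts its own batch of rays each step, so the
# effective global batch is num_gpus times the per-device setting. To keep
# the single-GPU behaviour, either divide the per-device ray budget by the
# number of GPUs, or keep the rays and shrink trainer.max_steps instead.

def scale_for_ddp(value: int, num_gpus: int) -> int:
    """Divide a single-GPU setting across DDP processes."""
    return value // num_gpus

# Illustrative single-GPU values; check your own config for the real ones.
single_gpu = {
    "model.train_num_rays": 256,
    "model.max_train_num_rays": 8192,
    "trainer.max_steps": 20000,
}

for key, value in single_gpu.items():
    print(f"{key}={scale_for_ddp(value, num_gpus=2)}")
```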

Hi! I halved model.train_num_rays and model.max_train_num_rays on V100 cards. Training takes 15 minutes on a single GPU but 20 minutes on two GPUs. It seems that, with the current network structure, using multiple cards does not bring a significant speed-up. Is that true?

Hi, have you solved this problem?