yulunzhang / RCAN

PyTorch code for our ECCV 2018 paper "Image Super-Resolution Using Very Deep Residual Channel Attention Networks"

Questions about using multiple gpu #27

Closed Alxemade closed 5 years ago

Alxemade commented 5 years ago

Hi, @yulunzhang. First of all, thank you for your open-source code; the reconstruction results are impressive. I read both the EDSR project and your project's source code. I use the command CUDA_VISIBLE_DEVICES=0,1,2 python main.py --model RCAN --save RCAN_BIX2_G10R20P48 --scale 2 --n_resgroups 10 --n_resblocks 20 --n_feats 64 --reset --chop --save_results --print_model --patch_size 96 --ext sep_reset --n_GPUs 3 to run on multiple GPUs. The code runs fine. But when I use watch -n 0.1 nvidia-smi to monitor GPU and memory usage, I see something unexpected. We all know that in PyTorch, when using multiple GPUs, the model is copied to each GPU and the data is distributed equally across the GPUs according to the batch size. So memory usage should be the same on each GPU, but in practice the memory usage decreases from one GPU to the next. I would like to ask how this happens. Am I overlooking some detail? Thank you. (screenshot of nvidia-smi output attached)

yulunzhang commented 5 years ago

Hi, your multi-GPU usage is correct. The reason the used GPU memory differs across GPUs is that the batch size (you used 16) is not evenly divisible by the number of GPUs (you used 3).

If you set the batch size to 15, I think the used GPU memory should be the same, or very similar, on each GPU.
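To illustrate the point above: PyTorch's DataParallel scatters a batch along dimension 0 using chunk-style splitting, which takes pieces of size ceil(batch / n_gpus) until the batch is exhausted. A minimal sketch of that split (the helper chunk_sizes is hypothetical, written here only to mimic the splitting rule, not an actual PyTorch API):

```python
import math

def chunk_sizes(batch_size, n_gpus):
    """Mimic chunk-style splitting: pieces of ceil(batch/n) until exhausted."""
    piece = math.ceil(batch_size / n_gpus)
    sizes = []
    remaining = batch_size
    while remaining > 0:
        take = min(piece, remaining)
        sizes.append(take)
        remaining -= take
    return sizes

print(chunk_sizes(16, 3))  # uneven split -> [6, 6, 4]
print(chunk_sizes(15, 3))  # even split   -> [5, 5, 5]
```

So with batch size 16 on 3 GPUs the per-GPU sub-batches are unequal, which shows up as unequal memory usage; with 15 they are equal. (Note that GPU 0 may still use somewhat more memory than the others, since DataParallel gathers outputs there.)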

Alxemade commented 5 years ago

OK, thank you! I used four GPUs and it works well. I will close this issue soon.