Closed minhpvo closed 3 years ago
Could you please give more info? How many GPUs are you using (for example, for 2 GPUs: --nproc_per_node=2)?
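For reference, a minimal launch sketch, assuming a 2-GPU machine and the repo's train_distributed.py entry point (script name taken from this thread; flags are the standard PyTorch launcher options):

```shell
# One worker process per GPU; --nproc_per_node must match the GPU count.
python -m torch.distributed.launch --nproc_per_node=2 train_distributed.py

# On newer PyTorch versions, the equivalent launcher is torchrun:
# torchrun --nproc_per_node=2 train_distributed.py
```

If --nproc_per_node is left at 1, only a single process (and hence a single GPU) is ever started, which matches the symptom reported below.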
Ah, I did not set nproc_per_node correctly. Got it fixed now.
Another question out of curiosity: why is train_distributed preferred over train_parallel, given that most students have at most a machine with 4 GPUs?
Thanks!
Hi, distributed training (DistributedDataParallel) usually has better efficiency than DataParallel. You can refer to the detailed documentation describing the mechanism.
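To illustrate the difference: DataParallel runs in a single process and re-scatters inputs and replicates the model every forward pass (GIL-bound, with gather overhead on one GPU), while DistributedDataParallel runs one process per device and only all-reduces gradients during backward. A minimal CPU-only sketch of the DDP side, using the gloo backend so it runs without GPUs (the model and sizes here are placeholders, not from this repo):

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP


def worker(rank, world_size):
    # Each rank is its own process; rendezvous via a local TCP store.
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    model = DDP(torch.nn.Linear(4, 2))  # gradients are all-reduced across ranks
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    loss = model(torch.randn(8, 4)).sum()
    loss.backward()  # gradient sync overlaps with the backward pass
    opt.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    world_size = 2  # one process per device
    mp.spawn(worker, args=(world_size,), nprocs=world_size)
```

With real GPUs, each rank would call `torch.cuda.set_device(rank)` and pass `device_ids=[rank]` to DDP, and the launcher shown earlier in the thread would start the processes instead of `mp.spawn`.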
Hi, thanks for the work.
I tried train_distributed.py, but the other GPUs clearly aren't used at all. Could you please check?