qianqianwang68 / omnimotion


how to use args.distributed #30

Closed: fangxy100 closed this issue 11 months ago

qianqianwang68 commented 11 months ago

Hi, to use distributed training, you can try `python -m torch.distributed.launch --nproc_per_node={NUM_GPUs} train.py ...`. The total number of points trained equals `NUM_GPUs * args.num_pts`, so `args.num_pts` can be reduced to 1/NUM_GPUs of the total number of points desired. The distributed training option is less thoroughly tested, so please let me know if you encounter any problems.
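
For concreteness, here is a minimal sketch of how the per-process point count relates to the total under `torch.distributed.launch`. The flag names and defaults below are illustrative assumptions, not the exact interface of `train.py`:

```python
import argparse
import torch.distributed as dist

# Illustrative sketch (not the actual train.py interface): each process
# launched by torch.distributed.launch samples args.num_pts points, so the
# effective total is world_size * args.num_pts.
parser = argparse.ArgumentParser()
parser.add_argument('--num_pts', type=int, default=256)   # per-process points (hypothetical default)
parser.add_argument('--local_rank', type=int, default=0)  # filled in by torch.distributed.launch
args = parser.parse_args()

# torch.distributed.launch sets MASTER_ADDR/MASTER_PORT/RANK/WORLD_SIZE,
# so the default env:// init method works here.
dist.init_process_group(backend='nccl')
world_size = dist.get_world_size()

total_pts_desired = 256                            # hypothetical target total
per_process_pts = total_pts_desired // world_size  # e.g. 64 per process on 4 GPUs
print(f'rank {dist.get_rank()}: {per_process_pts} pts per process, '
      f'{world_size * per_process_pts} in total across {world_size} processes')
```

Launched with, e.g., `python -m torch.distributed.launch --nproc_per_node=4 sketch.py --num_pts 64` (the script name here is a placeholder).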

nargenziano commented 8 months ago

What about the batch size (i.e., `args.num_pairs`)? Should it be scaled the same way?