rgcottrell / pytorch-human-performance-gec

A PyTorch implementation of "Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study"
Apache License 2.0
50 stars 19 forks

dist_c10d is not defined training error - distributed_utils #9

Open NikhilCherian opened 4 years ago

NikhilCherian commented 4 years ago

@rgcottrell @tianfeichen @cqlijingwei Hey again. Thanks for all the earlier replies. I was able to preprocess, train, and test everything in Google Colab. But recently I switched to training on my gaming laptop and got the following error.

dist_c10d is not defined [screenshot]

Can you explain more about has_c10d? In Google Colab, these were the parameters: ddp_backend='c10d', distributed_backend='nccl', distributed_init_method=None, distributed_port=-1, distributed_rank=0, distributed_world_size=1. But on my laptop, distributed_utils.py cannot import torch.distributed as dist_c10d; it always falls back to importing it as dist_no_c10d. Can you guide me here? [screenshot] When I use init_fn = dist_no_c10d.init_process_group instead, it does start importing the data. [screenshot]
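Not from the repository itself, but the behavior described above can be sketched generically: older fairseq-based code probes whether the c10d-backed torch.distributed API imports cleanly and falls back to the legacy (no_c10d) path otherwise. The helper names backend_available and pick_ddp_backend below are hypothetical, and the module path probed is an assumption; check the import block at the top of your distributed_utils.py for the exact names.

```python
import importlib.util


def backend_available(module_name: str) -> bool:
    # Hypothetical helper: returns True if `module_name` could be
    # imported, without actually importing it (cheap availability probe).
    return importlib.util.find_spec(module_name) is not None


def pick_ddp_backend() -> str:
    # Sketch of the c10d-vs-no_c10d selection the issue describes
    # (assumption): use 'c10d' when torch.distributed is importable,
    # otherwise fall back to the legacy 'no_c10d' path.
    try:
        available = backend_available("torch.distributed")
    except ModuleNotFoundError:
        # find_spec imports parent packages, so a missing torch
        # surfaces here rather than as a False return value.
        available = False
    return "c10d" if available else "no_c10d"
```

If pick_ddp_backend() returns "no_c10d" on the laptop but "c10d" in Colab, the difference is almost certainly the installed torch version rather than the training configuration.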

Any help would be appreciated. Thanks in advance.