@rgcottrell @tianfeichen @cqlijingwei Hey again. Thanks for all the earlier replies. I could preprocess, train, and test everything in Google Colab, but recently I switched to training on my gaming laptop and got this error:
dist_c10d is not defined.
Can you explain a bit more about has_c10d etc.?
Because in Google Colab, these were the parameters:
ddp_backend='c10d'
distributed_backend='nccl',
distributed_init_method=None,
distributed_port=-1,
distributed_rank=0, distributed_world_size=1
But in distributed_utils.py, it cannot import torch.distributed as dist_c10d; it always falls through to the dist_no_c10d path instead. Can you guide me here?
And when I set init_fn = dist_no_c10d.init_process_group, it does start loading the data and everything runs.
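For reference, here is a minimal sketch of the optional-import pattern I assume distributed_utils.py is using (the names dist_c10d, dist_no_c10d, and has_c10d come from the error and my args dump; this is paraphrased, the actual fairseq source may differ):

```python
# Sketch only: roughly how I understand the backend detection (not the exact fairseq code).
try:
    import torch.distributed as dist_c10d                 # new c10d backend
    import torch.distributed.deprecated as dist_no_c10d   # legacy backend
    has_c10d = True
except ImportError:
    import torch.distributed as dist_no_c10d               # only the legacy API is available
    has_c10d = False

# With ddp_backend='c10d', I expect something like:
#   init_fn = dist_c10d.init_process_group if has_c10d else dist_no_c10d.init_process_group
# On my laptop the import seems to hit the except branch, so dist_c10d never gets defined,
# which would match the "dist_c10d is not defined" error.
```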
Any help would be appreciated. Thanks in advance.