Hello, I get the following error when running train.py:
Traceback (most recent call last):
  File "train.py", line 25, in <module>
    trainer.train(save_model_path=args.save_model_path,
  File "/home/WA22201024/MASR-develop/masr/trainer.py", line 474, in train
    dist.init_process_group(backend='nccl')
  File "/home/WA22201024/anaconda3/envs/masr1/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 754, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "/home/WA22201024/anaconda3/envs/masr1/lib/python3.8/site-packages/torch/distributed/rendezvous.py", line 236, in _env_rendezvous_handler
    rank = int(_get_env_or_raise("RANK"))
  File "/home/WA22201024/anaconda3/envs/masr1/lib/python3.8/site-packages/torch/distributed/rendezvous.py", line 221, in _get_env_or_raise
    raise _env_error(env_var)
ValueError: Error initializing torch.distributed using env:// rendezvous: environment variable RANK expected, but not set
What is causing this, and how can I fix it?
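
For context on the error message: the env:// rendezvous used by dist.init_process_group reads RANK, WORLD_SIZE, MASTER_ADDR, and MASTER_PORT from the environment. A launcher such as torchrun (or python -m torch.distributed.launch on older PyTorch) normally sets these, while running python train.py directly does not. Below is a minimal sketch, assuming a single-process run on one machine, of filling in defaults before the trainer is started. The helper name ensure_single_process_env and the address and port values are illustrative assumptions, not part of MASR.

```python
import os


def ensure_single_process_env(env=None):
    """Set the env:// rendezvous variables if they are missing.

    Hypothetical helper for a single-process run; values already present
    (for example, set by a launcher) are left untouched.
    """
    if env is None:
        env = os.environ
    # Assumed defaults for one process on the local machine.
    defaults = {
        "RANK": "0",
        "WORLD_SIZE": "1",
        "MASTER_ADDR": "127.0.0.1",
        "MASTER_PORT": "29500",
    }
    added = {}
    for name, value in defaults.items():
        if name not in env:
            env[name] = value
            added[name] = value
    # Return only the variables this call had to fill in.
    return added


if __name__ == "__main__":
    print(ensure_single_process_env())
```

If this is the cause, calling such a helper before trainer.train, or launching with something like torchrun --nproc_per_node=1 train.py, should get past the missing RANK variable. Whether the rest of the training script needs further variables is not something the traceback shows.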