kenshohara / 3D-ResNets-PyTorch

3D ResNets for Action Recognition (CVPR 2018)
MIT License

KeyError of distributed training #244

Closed byrsongyuxin closed 3 years ago

byrsongyuxin commented 3 years ago

Traceback (most recent call last):
  File "main.py", line 415, in <module>
    opt = get_opt()
  File "main.py", line 72, in get_opt
    opt.dist_rank = int(os.environ["OMPI_COMM_WORLD_RANK"])
  File "/home/wuming/anaconda3/envs/action/lib/python3.7/os.py", line 681, in __getitem__
    raise KeyError(key) from None
KeyError: 'OMPI_COMM_WORLD_RANK'

byrsongyuxin commented 3 years ago

I replaced PMI_RANK with OMPI_COMM_WORLD_RANK.
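
A variable swap like this depends on the launcher: Open MPI's `mpirun` exports `OMPI_COMM_WORLD_RANK`, MPICH-style launchers export `PMI_RANK`, and PyTorch's own launcher exports `RANK`. Below is a minimal sketch of a lookup that tolerates any of the three. The helper name `get_dist_rank`, the tuple `RANK_ENV_VARS`, and the fallback to rank 0 are assumptions made for illustration and are not part of the repository.

```python
import os

# Hypothetical helper, not part of 3D-ResNets-PyTorch.
# Candidate variables, checked in order:
#   OMPI_COMM_WORLD_RANK  set by Open MPI's mpirun
#   PMI_RANK              set by MPICH-style launchers
#   RANK                  set by PyTorch's launcher
RANK_ENV_VARS = ("OMPI_COMM_WORLD_RANK", "PMI_RANK", "RANK")


def get_dist_rank(environ=None, default=0):
    """Return the process rank from the first variable that is set.

    Falls back to `default` (assumed here to be 0, i.e. a single-process
    run) instead of raising KeyError when no launcher variable exists.
    """
    environ = os.environ if environ is None else environ
    for name in RANK_ENV_VARS:
        value = environ.get(name)
        if value is not None:
            return int(value)
    return default
```

In `get_opt`, the direct `os.environ["OMPI_COMM_WORLD_RANK"]` lookup from the traceback could then become `opt.dist_rank = get_dist_rank()`. Whether a silent fallback to rank 0 is desirable is a judgment call; raising a clearer error may be safer when the script is always meant to run under a launcher.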

WUSHUANGPPP commented 3 years ago

Could you please tell us how to run distributed training? I hit the same error as you, but replacing PMI_RANK did not work for me. @byrsongyuxin Thank you very much!