zhanghang1989 / PyTorch-Encoding

A CV toolkit for my papers.
https://hangzhang.org/PyTorch-Encoding/
MIT License
2.04k stars 452 forks

Returns an error when using train_dist.py instead of train.py #296

Closed wanghao9610 closed 4 years ago

wanghao9610 commented 4 years ago

I ran train_dist.py instead of train.py, using the same command line as for train.py. It fails with the following error:

  File "train_dist.py", line 326, in <module>
    mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args))
  File "/data2/hwang/softwares/anaconda3/envs/encoding/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 162, in spawn
    process.start()
  File "/data2/hwang/softwares/anaconda3/envs/encoding/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/data2/hwang/softwares/anaconda3/envs/encoding/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
    return Popen(process_obj)
  File "/data2/hwang/softwares/anaconda3/envs/encoding/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/data2/hwang/softwares/anaconda3/envs/encoding/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/data2/hwang/softwares/anaconda3/envs/encoding/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/data2/hwang/softwares/anaconda3/envs/encoding/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function main_worker at 0x7f465ede04c0>: attribute lookup main_worker on __main__ failed

Has anyone else run into this error? How can I solve it? Thanks. @zhanghang1989

zhanghang1989 commented 4 years ago

Are you using PyTorch 1.4.0? There are some issues with newer versions.
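
For context on the error message itself: the traceback shows the worker function being pickled before the child process starts, and pickle stores a function by reference (its module name plus its function name), not by value. If that name cannot be looked up on the module at pickling time, pickling fails. The sketch below reproduces that general class of failure with the standard library only. It is an illustration, not a diagnosis of train_dist.py: `demo_train` is a hypothetical stand-in module and `try_pickle` is a helper made up for this example.

```python
import pickle
import sys
import types

# Hypothetical stand-in module, registered only for this illustration.
demo = types.ModuleType("demo_train")
sys.modules["demo_train"] = demo


def main_worker(gpu, ngpus_per_node, args):
    """Placeholder worker with the same signature shape as in the traceback."""
    return gpu


# Pretend the function belongs to the stand-in module.
main_worker.__module__ = "demo_train"


def try_pickle(fn):
    """Return "ok" if fn can be pickled by reference, otherwise "failed"."""
    try:
        pickle.dumps(fn)
        return "ok"
    except (pickle.PicklingError, AttributeError):
        return "failed"


# The module has no attribute named main_worker yet, so the lookup fails.
result_missing = try_pickle(main_worker)

# Once the name resolves to this exact function object, pickling works.
demo.main_worker = main_worker
result_present = try_pickle(main_worker)

print(result_missing, result_present)
```

Whether this is what happens in train_dist.py depends on the environment; the suggestion above about the PyTorch version is the first thing to check.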