Open an99990 opened 3 years ago
I ran into the same problem. Do you have any insight into how to solve it?
I ran into the same problem. Did you solve it?
Hi, sorry for the late response. I switched to an older Python version and rebuilt every package, and it worked somehow.
OK, I will try this, thank you.
I have successfully installed apex and normalSpeed, and I have tested normalSpeed with example.py. I then tried to run:
python3 -m torch.distributed.launch --nproc_per_node=1 train_lm.py --gpu '0' --cls $cls -eval_net -checkpoint $tst_mdl -test -test_pose
and I got the error below:
/home/halodi/.local/share/virtualenvs/RandLA-jKwUC53Q/bin/python3: can't open file 'train_lm.py': [Errno 2] No such file or directory
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/halodi/.local/share/virtualenvs/RandLA-jKwUC53Q/lib/python3.8/site-packages/torch/distributed/launch.py", line 263, in <module>
    main()
  File "/home/halodi/.local/share/virtualenvs/RandLA-jKwUC53Q/lib/python3.8/site-packages/torch/distributed/launch.py", line 258, in main
    raise subprocess.CalledProcessError(returncode=process.returncode,
subprocess.CalledProcessError: Command '['/home/halodi/.local/share/virtualenvs/RandLA-jKwUC53Q/bin/python3', '-u', 'train_lm.py', '--local_rank=0', '--gpu', '0', '--cls', 'ape', '-eval_net', '-checkpoint', './linemod_pretrained/FFB6D_ape_best.pth.tar', '-test', '-test_pose']' returned non-zero exit status 2.
Please help, @ethnhe. Thanks.
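For anyone hitting the same message: the first line of the error says the interpreter could not open 'train_lm.py' relative to the current working directory, so the launcher never got as far as running the training code. That usually points to the command being run from a folder that does not contain the script, though the thread does not confirm this as the cause. Below is a minimal sketch of a pre-flight check, assuming you want to fail early with a clearer message. The helper name `build_launch_command` and its parameters are hypothetical and not part of the repository or of PyTorch; only the `python -m torch.distributed.launch --nproc_per_node=...` invocation is taken from the command above.

```python
import os
import sys


def build_launch_command(script, script_args, nproc_per_node=1, cwd=None):
    """Build the torch.distributed.launch command after checking the script exists.

    Hypothetical helper for illustration only. `cwd` is the directory the
    command will be run from; it defaults to the current working directory.
    """
    cwd = cwd or os.getcwd()
    script_path = os.path.join(cwd, script)

    # The launcher resolves the script relative to the working directory,
    # so check there before spawning anything.
    if not os.path.isfile(script_path):
        raise FileNotFoundError(
            f"{script!r} not found in {cwd!r}; "
            "change to the directory that contains it before launching."
        )

    return [
        sys.executable,
        "-m",
        "torch.distributed.launch",
        f"--nproc_per_node={nproc_per_node}",
        script,
        *script_args,
    ]


if __name__ == "__main__":
    try:
        cmd = build_launch_command("train_lm.py", ["--gpu", "0"])
        print("Would run:", " ".join(cmd))
    except FileNotFoundError as err:
        print("Pre-flight check failed:", err)
```

The returned list can be passed to `subprocess.run(cmd, cwd=...)` once the check passes. The check only covers the missing-file case; it says nothing about the checkpoint path or the other arguments.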