xy-guo / LIGA-Stereo

Code for LIGA-Stereo Detector, ICCV'21
Apache License 2.0
90 stars 18 forks

When I run these scripts, there are some questions #4

Open Xie-PC opened 2 years ago

Xie-PC commented 2 years ago

Thanks for sharing. When I first run either of the following commands in my Docker container, nothing is shown:

`./scripts/dist_train.sh 1 dev configs/stereo/kitti_models/liga.yaml`

`./scripts/dist_test_ckpt.sh 1 ./configs/stereo/kitti_models/liga.yaml ./ckpt/pretrained_liga.pth`

If I cancel the process with Ctrl+C and run it again, it shows:

```bash
Traceback (most recent call last):
  File "tools/train.py", line 211, in <module>
    main()
  File "tools/train.py", line 73, in main
    args.tcp_port, args.local_rank, backend='nccl'
  File "/root/LIGA-Stereo-master/liga/utils/common_utils.py", line 181, in init_dist_pytorch
    world_size=num_gpus
  File "/root/miniconda3/envs/liga/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 422, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "/root/miniconda3/envs/liga/lib/python3.7/site-packages/torch/distributed/rendezvous.py", line 126, in _tcp_rendezvous_handler
    store = TCPStore(result.hostname, result.port, world_size, start_daemon, timeout)
RuntimeError: Address already in use
Traceback (most recent call last):
  File "/root/miniconda3/envs/liga/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/root/miniconda3/envs/liga/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/miniconda3/envs/liga/lib/python3.7/site-packages/torch/distributed/launch.py", line 261, in <module>
    main()
  File "/root/miniconda3/envs/liga/lib/python3.7/site-packages/torch/distributed/launch.py", line 257, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/root/miniconda3/envs/liga/bin/python', '-u', 'tools/train.py', '--local_rank=0', '--launcher', 'pytorch', '--fix_random_seed', '--sync_bn', '--save_to_file', '--cfg_file', 'configs/stereo/kitti_models/liga.yaml', '--exp_name', 'dev']' returned non-zero exit status 1.
```

How should I solve it?

WeiSQ-zju commented 2 years ago

I met the same problem, and I solved it by changing liga.yaml to liga.3d-and-bev.yaml:

`./scripts/dist_train.sh 1 dev configs/stereo/kitti_models/liga.3d-and-bev.yaml`

Xie-PC commented 2 years ago

Thanks for your reply, but that did not work for me. I still have the same problem.

xy-guo commented 2 years ago

The Python process was not completely killed, so it is still holding the TCP port. Find its PID and kill it (or run `killall python` if this is the only Python program you are running).
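
To confirm that a leftover process is still holding the rendezvous port before relaunching, a quick check like the following may help. This is only a minimal sketch using the standard `socket` module: the helper names `port_is_free` and `find_free_port` are hypothetical and not part of LIGA-Stereo, and which port the training script actually uses (the traceback shows it comes from `args.tcp_port`) has to be looked up in your own setup.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Binding failed, e.g. "Address already in use": another
            # process (possibly a half-killed training run) holds the port.
            return False
        return True


def find_free_port(host="127.0.0.1"):
    """Ask the OS for an unused TCP port and return its number."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS choose a free port
        return sock.getsockname()[1]
```

If `port_is_free(port)` returns `False` for the port your run uses, the old process is probably still alive and should be killed as described above. Alternatively, assuming the launch script lets you override the port, a value from `find_free_port()` could be passed instead.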

monstre0731 commented 2 years ago

> I met the same problem, and I solved it by changing liga.yaml to liga.3d-and-bev.yaml: ./scripts/dist_train.sh 1 dev configs/stereo/kitti_models/liga.3d-and-bev.yaml

Hi, did you face any problems like "cannot import ** from mmcv.cnn"? I installed mmcv-full and mmdet. I also tried different versions of mmcv and mmdet, but I could not find a combination that runs the test/train scripts. Any suggestions would be helpful. Thanks a lot!

xy-guo commented 2 years ago

@QingwuLiu-polymtl Could you show the complete error log? It should work if you have installed mmcv-full rather than mmcv.
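
When collecting that log, it may also help to report which of the packages are actually installed in the environment. The snippet below is a rough sketch that only reads installed package metadata. The helper names `installed_version` and `report` are hypothetical, and on Python 3.7 (the version in the traceback earlier in this thread) it assumes the `importlib_metadata` backport is available.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    # Assumption: the importlib_metadata backport is installed on Python 3.7.
    import importlib_metadata as metadata

# Distribution names discussed in this thread.
PACKAGES = ("mmcv", "mmcv-full", "mmdet")


def installed_version(name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def report(packages=PACKAGES):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```

If both `mmcv` and `mmcv-full` show a version, that is worth mentioning in the report: as far as I know, both install the same `mmcv` import package, so having both can lead to confusing import errors.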