wuhuikai / FastFCN

FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation.
http://wuhuikai.me/FastFCNProject
Other
838 stars, 148 forks

Training hangs with the latest version #98

Closed: Anikily closed this issue 3 years ago

Anikily commented 3 years ago

Namespace(aux=True, aux_weight=0.4, backbone='resnet50', base_size=520, batch_size=16, checkname='psp_res50_ade20k', crop_size=480, cuda=True, dataset='ade20k', dilated=False, dist_backend='nccl', dist_url='tcp://127.0.0.1:1735', epochs=120, ft=False, jpu=True, lateral=False, lr=0.01, lr_scheduler='poly', mode='testval', model='psp', model_zoo=None, momentum=0.9, ms=False, no_cuda=False, no_val=True, rank=0, resume=None, save_folder='experiments/segmentation/results', se_loss=False, se_weight=0.2, seed=1, split='val', start_epoch=0, test_batch_size=16, train_split='train', weight_decay=0.0001, workers=4, world_size=1)
Use GPU: 0 for training
BaseDataset: base_size 520, crop_size 480
^CTraceback (most recent call last):
  File "/root/anaconda3/envs/open-mmlab/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/root/anaconda3/envs/open-mmlab/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/segmentation/FastFCN-latest/experiments/segmentation/train.py", line 212, in <module>
    mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args))
  File "/root/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 200, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/root/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 158, in start_processes
    while not context.join():
  File "/root/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 78, in join
    timeout=timeout,
  File "/root/anaconda3/envs/open-mmlab/lib/python3.7/multiprocessing/connection.py", line 921, in wait
    ready = selector.select(timeout)
  File "/root/anaconda3/envs/open-mmlab/lib/python3.7/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)
KeyboardInterrupt

After I stopped it manually, the output above is what was shown.
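
Note that the traceback above comes from the parent process, which is only waiting in the join loop of mp.spawn; it does not show where the spawned worker itself is blocked. One possible way to find out is Python's standard faulthandler module, which can dump the stack of every thread in a process once a timeout has passed. The following is a minimal sketch under that assumption, not part of FastFCN: the helper names enable_hang_dump, disable_hang_dump and demo_worker are hypothetical, and demo_worker merely stands in for the real main_worker.

```python
import faulthandler
import sys
import time


def enable_hang_dump(timeout_s=60.0, stream=sys.stderr):
    """Dump all thread stacks to `stream` every `timeout_s` seconds.

    `stream` must be backed by a real file descriptor (sys.stderr or an
    opened file), because faulthandler writes to the descriptor directly.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=stream)


def disable_hang_dump():
    """Cancel the pending dump, e.g. once the suspicious phase has finished."""
    faulthandler.cancel_dump_traceback_later()


def demo_worker(rank, stall_s, timeout_s, stream):
    """Hypothetical stand-in for a spawned worker that stalls for a while."""
    enable_hang_dump(timeout_s, stream)
    try:
        # Stands in for whatever call blocks in the real worker.
        time.sleep(stall_s)
    finally:
        disable_hang_dump()
```

In a real run, the assumption is that enable_hang_dump() would be called as the first statement of main_worker, so that each spawned process reports its own stack to stderr if it is still running after the timeout.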

wuhuikai commented 3 years ago

Can you run it with multiple GPUs?