Closed Yx1322441675 closed 3 years ago
Hi,
It seems to be a problem with the mmcv version. Could you please try mmcv=1.1.5 and pytorch=1.7.0?
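To confirm which versions the active environment actually provides, a minimal sketch along these lines may help. It assumes Python 3.8+ (for `importlib.metadata`) and that the packages are registered under the distribution names `mmcv-full` and `torch`; `installed_version` is a hypothetical helper, not part of mmcv or AlignPS.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Versions suggested in this thread; the distribution names are assumptions.
    expected = {"mmcv-full": "1.1.5", "torch": "1.7.0"}
    for name, wanted in expected.items():
        found = installed_version(name)
        # startswith tolerates local build suffixes such as "+cu110".
        matches = found is not None and found.startswith(wanted)
        status = "ok" if matches else "mismatch"
        print(f"{name}: expected {wanted}, found {found} ({status})")
```

Running this with the same interpreter that launches the test script shows whether the intended versions are the ones being imported.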
The following is my setup. I have changed the versions, but I still get an error. Maybe you can help me.

Package           Version             Location
addict            2.4.0
certifi           2020.12.5
cycler            0.10.0
Cython            0.29.22
joblib            1.0.1
kiwisolver        1.3.1
matplotlib        3.4.0
mkl-fft           1.3.0
mkl-random        1.1.1
mkl-service       2.3.0
mmcv-full         1.1.5
mmdet             2.4.0               /home/goo/yx/AlignPS
mmpycocotools     12.0.3
numpy             1.19.2
olefile           0.46
opencv-python     4.5.1.48
Pillow            8.1.2
pip               21.0.1
pyparsing         2.4.7
python-dateutil   2.8.1
PyYAML            5.4.1
scikit-learn      0.24.1
scipy             1.6.2
setuptools        52.0.0.post20210125
six               1.15.0
sklearn           0.0
terminaltables    3.1.0
threadpoolctl     2.1.0
torch             1.7.0
torchaudio        0.7.0a0+ac17b64
torchvision       0.8.0
typing-extensions 3.7.4.3
wheel             0.36.2
yapf              0.31.0
Traceback (most recent call last):
File "./tools/test_results.py", line 75, in
I also don't know where the file "results_1000.pkl" is. My training results don't include it. Maybe I should change the name? Or maybe the file doesn't actually exist?
Your actual issue is now "RuntimeError: CUDA error: out of memory". Please make sure you have enough GPU memory; testing needs about 4 GB.
The file "results_1000.pkl" is generated once "./tools/dist_test.sh" runs successfully, and it is saved as "work_dirs/${TESTPATH}/results_1000.pkl".
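Since the evaluation script fails with FileNotFoundError when the test run did not finish, it can help to check for the results file first. The following is a minimal sketch, assuming the "work_dirs/${TESTPATH}/results_1000.pkl" layout described above; `results_file` and `check_results` are hypothetical helpers, and "my_experiment" is a placeholder for the value of TESTPATH.

```python
import os


def results_file(test_path, root="work_dirs"):
    """Build the path where the test run is expected to save its results."""
    return os.path.join(root, test_path, "results_1000.pkl")


def check_results(test_path, root="work_dirs"):
    """Return (path, exists) so a missing file is caught before evaluation."""
    path = results_file(test_path, root)
    return path, os.path.isfile(path)


if __name__ == "__main__":
    # "my_experiment" is a placeholder for the value of TESTPATH.
    path, exists = check_results("my_experiment")
    if exists:
        print(f"found {path}")
    else:
        print(f"{path} is missing; the test run probably failed before saving it")
```

If the file is missing, the error to fix is the one raised earlier by the test run, not the FileNotFoundError from the evaluation step.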
The model name was not right. I've updated the test script; please try it.
Traceback (most recent call last):
File "./tools/test_results_prw.py", line 313, in
It seems some of your detection results are empty. Could you please check the detection results in results_1000.pkl?
I also added a check here: https://github.com/daodaofr/AlignPS/blob/c20cf329b2934a8693e2064435d3e3f65c496095/tools/test_results_prw.py#L144
You may try it.
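One way to look for empty detections is sketched below. It assumes that results_1000.pkl unpickles to a sequence with one entry per image, and that each entry is either a sized object (such as an array of boxes) or a list of such objects; that layout is an assumption, not something confirmed in this thread. `is_empty`, `empty_indices`, and `load_results` are hypothetical helpers.

```python
import pickle


def is_empty(entry):
    """True if entry is None, has length 0, or is a list/tuple of only empty items."""
    if entry is None:
        return True
    if isinstance(entry, (list, tuple)):
        return all(is_empty(item) for item in entry)
    try:
        return len(entry) == 0
    except TypeError:
        return False  # scalars and other unsized objects count as non-empty


def empty_indices(results):
    """Return the indices of the per-image entries that contain no detections."""
    return [i for i, entry in enumerate(results) if is_empty(entry)]


def load_results(path):
    """Unpickle the results file; array entries need numpy installed to load."""
    with open(path, "rb") as fid:
        return pickle.load(fid)
```

Calling `empty_indices(load_results(path))` with the path of the results file lists the images for which nothing was detected.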
I got a similar problem. My versions (mmcv=1.1.5, pytorch=1.7.0) are correct, so I don't know why it fails.
Traceback (most recent call last):
File "./tools/test_results.py", line 75, in
@jiabeiwangTJU
Hi, your issue is "work_dirs/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_dcn_4x4_1x_cuhk_reid_1500_stage1_fpncat_dcn_epoch24_multiscale_focal_x4_bg-2_lconv3dcn_sub_triqueue_dcn0/24.pth is not a checkpoint file".
Could you check your checkpoint path, or try to load the checkpoint manually?
My checkpoint directory contains a "latest.pth", so I changed TESTNAME='cuhk_alignps.pth' to 'latest.pth', and that works. Thanks.
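A quick way to see which checkpoint names are available before setting TESTNAME is sketched below. It assumes the checkpoints are ".pth" files stored directly in the work directory; `list_checkpoints` and `resolve_checkpoint` are hypothetical helpers, not part of AlignPS.

```python
import os


def list_checkpoints(work_dir):
    """Return the sorted .pth file names in work_dir, or [] if it does not exist."""
    if not os.path.isdir(work_dir):
        return []
    return sorted(name for name in os.listdir(work_dir) if name.endswith(".pth"))


def resolve_checkpoint(work_dir, test_name):
    """Return the full path for test_name, or raise listing the available names."""
    available = list_checkpoints(work_dir)
    if test_name not in available:
        raise FileNotFoundError(
            f"{test_name!r} not found in {work_dir!r}; available: {available}"
        )
    return os.path.join(work_dir, test_name)
```

The error message lists the names that do exist, which makes a mismatch such as 'cuhk_alignps.pth' versus 'latest.pth' easy to spot.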
(open-mmlab) goo@goo-Z390-GAMING-X:~/yx/AlignPS$ sh run_test.sh
loading annotations into memory...
Done (t=0.12s)
creating index...
index created!
/home/goo/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/cnn/bricks/conv_module.py:100: UserWarning: ConvModule has norm and bias at the same time
warnings.warn('ConvModule has norm and bias at the same time')
[ ] 0/6978, elapsed: 0s, ETA:
/home/goo/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/functional.py:3328: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
/home/goo/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/functional.py:3458: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode)
Traceback (most recent call last):
File "./tools/test.py", line 226, in
main()
File "./tools/test.py", line 187, in main
args.gpu_collect)
File "/home/goo/yx/AlignPS/mmdet/apis/test.py", line 98, in multi_gpu_test
result = model(return_loss=False, rescale=True, **data)
File "/home/goo/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/goo/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 705, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/goo/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/goo/yx/AlignPS/mmdet/core/fp16/decorators.py", line 51, in new_func
return old_func(*args, **kwargs)
File "/home/goo/yx/AlignPS/mmdet/models/detectors/base.py", line 170, in forward
return self.forward_test(img, img_metas, **kwargs)
File "/home/goo/yx/AlignPS/mmdet/models/detectors/base.py", line 147, in forward_test
return self.simple_test(imgs[0], img_metas[0], **kwargs)
File "/home/goo/yx/AlignPS/mmdet/models/detectors/single_stage_reid.py", line 118, in simple_test
outs, img_metas, rescale=rescale)
File "/home/goo/yx/AlignPS/mmdet/core/fp16/decorators.py", line 131, in new_func
return old_func(*args, **kwargs)
File "/home/goo/yx/AlignPS/mmdet/models/dense_heads/fcos_reid_head_focal_sub_triqueue.py", line 454, in get_bboxes
img_shape = img_metas[img_id]['img_shape']
TypeError: 'DataContainer' object is not subscriptable
Killing subprocess 6428
Traceback (most recent call last):
File "/home/goo/anaconda3/envs/open-mmlab/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/goo/anaconda3/envs/open-mmlab/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/goo/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launch.py", line 340, in
main()
File "/home/goo/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launch.py", line 326, in main
sigkill_handler(signal.SIGTERM, None) # not coming back
File "/home/goo/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/goo/anaconda3/envs/open-mmlab/bin/python', '-u', './tools/test.py', '--local_rank=0', './configs/fcos/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_dcn_4x4_1x_cuhk_reid_1500_stage1_fpncat_dcn_epoch24_multiscale_focal_x4_bg-2_lconv3dcn_sub_triqueue_dcn0.py', 'work_dirs/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_dcn_4x4_1x_cuhk_reid_1500_stage1_fpncat_dcn_epoch24_multiscale_focal_x4_bg-2_lconv3dcn_sub_triqueue_dcn0/epoch_24.pth', '--launcher', 'pytorch', '--out', 'work_dirs/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_dcn_4x4_1x_cuhk_reid_1500_stage1_fpncat_dcn_epoch24_multiscale_focal_x4_bg-2_lconv3dcn_sub_triqueue_dcn0/results_1000.pkl']' returned non-zero exit status 1.
Traceback (most recent call last):
File "./tools/test_results.py", line 75, in
with open(os.path.join(results_path, 'results_1000.pkl'), 'rb') as fid:
FileNotFoundError: [Errno 2] No such file or directory: '/home/goo/yx/AlignPS/work_dirs/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_dcn_4x4_1x_cuhk_reid_1500_stage1_fpncat_dcn_epoch24_multiscale_focal_x4_bg-2_lconv3dcn_sub_triqueue_dcn0/results_1000.pkl'
fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_dcn_4x4_1x_cuhk_reid_1500_stage1_fpncat_dcn_epoch24_multiscale_focal_x4_bg-2_lconv3dcn_sub_triqueue_dcn0
I don't know why. Maybe you can help me. Thank you!