Closed novice03 closed 3 years ago
Thanks for your error report and we appreciate it a lot.
Checklist
- I have searched related issues but cannot get the expected help.
- I have read the FAQ documentation but cannot get the expected help.
- The bug has not been fixed in the latest version.
Describe the bug I am training detection models on a dataset from Kaggle. I converted the data to COCO format and checked that there are no errors in the conversion. When I use a pipeline that is different from the default, the validation step throws an error. This error does NOT occur if I don't make any modifications to the data pipelines.
Reproduction
- What command or script did you run?
!python tools/train.py configs/siim/siim.py
- Did you make any modifications on the code or config? Did you understand what you have modified?
Yes, I created the config file as follows:
cfg = Config.fromfile('/content/mmdetection/configs/vfnet/vfnet_r50_fpn_mdconv_c3-c5_mstrain_2x_coco.py')

cfg.model.bbox_head.num_classes = 1
cfg.classes = ('opacity', )
cfg.data.samples_per_gpu = 4

cfg.data.train.img_prefix = '/content/images/train'
cfg.data.train.ann_file = '/content/annotations/train.json'
cfg.data.train.classes = cfg.classes

cfg.data.val.img_prefix = '/content/images/val'
cfg.data.val.ann_file = '/content/annotations/val.json'
cfg.data.val.classes = cfg.classes

cfg.data.test.img_prefix = '/content/images/val'
cfg.data.test.ann_file = '/content/annotations/val.json'
cfg.data.test.classes = cfg.classes

cfg.evaluation.metric = 'bbox'

img_norm_cfg = dict(
    mean=[0.0, 0.0, 0.0], std=[1.0, 1.0, 1.0], to_rgb=True)

cfg.train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='Resize',
        img_scale=[(1333, 480), (1333, 960)],
        multiscale_mode='range',
        keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]

cfg.data.train.pipeline = cfg.train_pipeline
cfg.data.val.pipeline = cfg.train_pipeline

cfg.optimizer.lr = 0.02 / 8
cfg.lr_config = dict(
    policy = 'CosineAnnealing',
    by_epoch = False,
    warmup = 'linear',
    warmup_iters = 500,
    warmup_ratio = 0.001,
    min_lr = 1e-07)

cfg.runner.max_epochs = 12
cfg.load_from = '/content/drive/MyDrive/vfnet_r50_fpn_mdconv_c3-c5_mstrain_2x_coco_20201027pth-6879c318.pth'

cfg.dump('configs/siim/siim.py')
If I understand correctly, cfg.data.train.pipeline and cfg.data.val.pipeline get changed to the train_pipeline above.
- What dataset did you use?
A dataset from Kaggle converted to COCO format.
Environment
- Please run python mmdet/utils/collect_env.py to collect necessary environment information and paste it here.
- You may add additional information that may be helpful for locating the problem, such as
  - How you installed PyTorch [e.g., pip, conda, source]
  - Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

sys.platform: linux
Python: 3.7.11 (default, Jul 3 2021, 18:01:19) [GCC 7.5.0]
CUDA available: True
GPU 0: Tesla P100-PCIE-16GB
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.0_bu.TC445_37.28845127_0
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.9.0+cu102
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.2
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
  - CuDNN 7.6.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
TorchVision: 0.10.0+cu102
OpenCV: 4.1.2
MMCV: 1.3.9
MMCV Compiler: GCC 7.5
MMCV CUDA Compiler: 11.0
MMDetection: 2.14.0+4853ea1
Error traceback If applicable, paste the error traceback here.
[ ] 0/1267, elapsed: 0s, ETA:
Traceback (most recent call last):
  File "tools/train.py", line 188, in <module>
    main()
  File "tools/train.py", line 184, in main
    meta=meta)
  File "/content/mmdetection/mmdet/apis/train.py", line 170, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 54, in train
    self.call_hook('after_train_epoch')
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/base_runner.py", line 307, in call_hook
    getattr(hook, fn_name)(self)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/hooks/evaluation.py", line 220, in after_train_epoch
    self._do_evaluate(runner)
  File "/content/mmdetection/mmdet/core/evaluation/eval_hooks.py", line 17, in _do_evaluate
    results = single_gpu_test(runner.model, self.dataloader, show=False)
  File "/content/mmdetection/mmdet/apis/test.py", line 25, in single_gpu_test
    for i, data in enumerate(data_loader):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 425, in reraise
    raise self.exc_type(msg)
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/mmdetection/mmdet/datasets/custom.py", line 192, in __getitem__
    return self.prepare_test_img(idx)
  File "/content/mmdetection/mmdet/datasets/custom.py", line 235, in prepare_test_img
    return self.pipeline(results)
  File "/content/mmdetection/mmdet/datasets/pipelines/compose.py", line 40, in __call__
    data = t(data)
  File "/content/mmdetection/mmdet/datasets/pipelines/loading.py", line 370, in __call__
    results = self._load_bboxes(results)
  File "/content/mmdetection/mmdet/datasets/pipelines/loading.py", line 245, in _load_bboxes
    ann_info = results['ann_info']
KeyError: 'ann_info'
Bug fix If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!
How was your problem solved? Apart from the "KeyError: 'ann_info'", I am also hitting other missing-key errors in my projects. Where should I modify the code or config to fix them?
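One possible reading of the traceback above, offered as a sketch rather than a confirmed fix: the error is raised from prepare_test_img, which is the path the validation dataset takes, and the failing step is LoadAnnotations looking up results['ann_info']. That suggests the validation pipeline should not contain annotation-loading steps, so reusing train_pipeline for cfg.data.val.pipeline would trigger it. The pipeline below is modeled on the test-style pipelines in the stock MMDetection 2.x COCO configs; the img_scale of (1333, 800) is an assumption, and annotation_steps is a hypothetical helper written only for this illustration, not part of MMDetection.

```python
# Sketch only: plain dicts, so no mmdet/mmcv import is needed to build it.
img_norm_cfg = dict(mean=[0.0, 0.0, 0.0], std=[1.0, 1.0, 1.0], to_rgb=True)

# Test-style pipeline without LoadAnnotations.
# img_scale=(1333, 800) is an assumed value; pick what suits your data.
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ]),
]


def annotation_steps(pipeline):
    """Hypothetical helper: list steps that read results['ann_info'].

    Recurses into nested 'transforms' lists such as MultiScaleFlipAug.
    """
    needs_ann = {'LoadAnnotations'}
    found = []
    for step in pipeline:
        if step['type'] in needs_ann:
            found.append(step['type'])
        found.extend(annotation_steps(step.get('transforms', [])))
    return found


# An empty list means the pipeline never asks for ground-truth annotations.
print(annotation_steps(test_pipeline))
```

With a config object like the one in the report, this would be assigned as cfg.data.val.pipeline = test_pipeline and cfg.data.test.pipeline = test_pipeline, leaving cfg.data.train.pipeline on the training pipeline. Other missing-key errors during validation may have the same cause: a step or a Collect key (for example gt_bboxes or gt_labels) that only exists when annotations are loaded.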