implus / GFocal

Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection, NeurIPS2020
Apache License 2.0

IndexError: list index out of range at the end of the first epoch #7

Closed: ghost closed this issue 4 years ago

ghost commented 4 years ago

Dear team, at the end of the first epoch I get the following error:

2020-07-06 18:17:56,734 - mmdet - INFO - Epoch [1][300/359]     lr: 0.00732, eta: 1:58:24, time: 0.539, data_time: 0.006, memory: 4519, loss_qfl: 0.1841, loss_bbox: 0.4076, loss_dfl: 0.2519, loss: 0.8436
2020-07-06 18:18:23,969 - mmdet - INFO - Epoch [1][350/359]     lr: 0.00799, eta: 1:57:23, time: 0.545, data_time: 0.006, memory: 4519, loss_qfl: 0.1798, loss_bbox: 0.4058, loss_dfl: 0.2517, loss: 0.8373
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 

**508/506, 29.7 task/s, elapsed: 17s, ETA:     0s**

Traceback (most recent call last):
  File "./tools/train.py", line 151, in <module>
    main()
  File "./tools/train.py", line 147, in main
    meta=meta)
  File "/root/sharedfolder/production/GFocal/mmdet/apis/train.py", line 165, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/opt/conda/lib/python3.6/site-packages/mmcv/runner/runner.py", line 371, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/mmcv/runner/runner.py", line 285, in train
    self.call_hook('after_train_epoch')
  File "/opt/conda/lib/python3.6/site-packages/mmcv/runner/runner.py", line 238, in call_hook
    getattr(hook, fn_name)(self)
  File "/root/sharedfolder/production/GFocal/mmdet/core/evaluation/eval_hooks.py", line 76, in after_train_epoch
    self.evaluate(runner, results)
  File "/root/sharedfolder/production/GFocal/mmdet/core/evaluation/eval_hooks.py", line 33, in evaluate
    results, logger=runner.logger, **self.eval_kwargs)
  File "/root/sharedfolder/production/GFocal/mmdet/datasets/coco.py", line 326, in evaluate
    result_files, tmp_dir = self.format_results(results, jsonfile_prefix)
  File "/root/sharedfolder/production/GFocal/mmdet/datasets/coco.py", line 287, in format_results
    result_files = self.results2json(results, jsonfile_prefix)
  File "/root/sharedfolder/production/GFocal/mmdet/datasets/coco.py", line 217, in results2json
    json_results = self._det2json(results)
  File "/root/sharedfolder/production/GFocal/mmdet/datasets/coco.py", line 155, in _det2json
    data['category_id'] = self.cat_ids[label]
IndexError: list index out of range
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.6/site-packages/torch/distributed/launch.py", line 263, in <module>
    main()
  File "/opt/conda/lib/python3.6/site-packages/torch/distributed/launch.py", line 259, in main

This happens only when I pass the --validate option:

./tools/dist_train.sh configs/gfl_x101_ms2x.py 4 --validate

Any advice about what may be wrong here?

Thanks,

implus commented 4 years ago

This is a compatibility problem between mmdet v1.0 and v2.0. Since GFocal is now officially supported in mmdetection, you can switch to the official mmdetection v2.0 and rebuild your environment for it. Thanks~
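
For anyone who wants to confirm the mismatch before switching versions, here is a minimal diagnostic sketch. It is not part of GFocal or mmdetection: `find_out_of_range_labels` is a hypothetical helper, and it assumes the results passed to `_det2json` hold one list per image with one group of boxes per class index, which is what the `self.cat_ids[label]` line in the traceback suggests. It reports class indices that contain detections but have no matching entry in `cat_ids`, for example when the head is configured with more class slots than the dataset has categories.

```python
# Hypothetical diagnostic helper; not an mmdetection or GFocal API.
from typing import List, Sequence


def find_out_of_range_labels(results: Sequence, cat_ids: List[int]) -> List[int]:
    """Return class indices that hold detections but have no entry in cat_ids.

    Assumed layout: results[image][label] is a list (or array) of boxes,
    so the position in the per-image list is the class index.
    """
    bad = set()
    for per_image in results:
        for label, boxes in enumerate(per_image):
            # Indexing cat_ids[label] would raise IndexError for these labels.
            if label >= len(cat_ids) and len(boxes) > 0:
                bad.add(label)
    return sorted(bad)


# Toy data: the dataset has 3 categories, but the model emits 4 class slots.
cat_ids = [1, 2, 3]
results = [
    [[], [[0, 0, 10, 10, 0.9]], [], []],   # image 0: only valid labels used
    [[], [], [], [[5, 5, 20, 20, 0.8]]],   # image 1: a box in class slot 3
]
print(find_out_of_range_labels(results, cat_ids))
```

If this reports any index, the number of classes the model predicts and the number of categories the dataset exposes disagree. As far as I know, the two mmdet versions also differ in where they place the background label, so mixing a v1.0 codebase with a v2.0-style config or install is one plausible way to end up here.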