Open zhanglirong1999 opened 2 years ago
Paste the error of COCO128 here and describe this dataset as well :)
Evaluating COCO128 with the official training model fails with the following error:
2022-03-30 15:56:05 | ERROR | yolox.core.launch:100 - An error has been caught in function 'launch', process 'MainProcess' (26317), thread 'MainThread' (140210491134144):
Traceback (most recent call last):
File "/home/lirong/anaconda3/envs/yolo/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
└ ModuleSpec(name='yolox.tools.train', loader=<_frozen_importlib_external.SourceFileLoader object at 0x7f850d1d0a50>, origin='/...
File "/home/lirong/anaconda3/envs/yolo/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
│ └ {'__name__': '__main__', '__doc__': None, '__package__': 'yolox.tools', '__loader__': <_frozen_importlib_external.SourceFileL...
└ <code object
File "/home/lirong/YOLO_RUN/YOLOX/tools/train.py", line 138, in
File "/home/lirong/YOLO_RUN/YOLOX/yolox/core/launch.py", line 100, in launch
main_func(*args)
│ └ (╒═══════════════════╤═══════════════════════════════════════════════════════════════════════════════════════════════════════... └ <function main at 0x7f850d1b9f80>
File "/home/lirong/YOLO_RUN/YOLOX/tools/train.py", line 106, in main
trainer.train_in_orca()
│ └ <function Trainer.train_in_orca at 0x7f850c4cb680> └ <yolox.core.trainer.Trainer object at 0x7f852f735690>
File "/home/lirong/YOLO_RUN/YOLOX/yolox/core/trainer.py", line 233, in train_in_orca
self.evaluate_and_save_model_orca(model)
│ │ └ YOLOX( │ │ (backbone): YOLOPAFPN( │ │ (backbone): CSPDarknet( │ │ (stem): Focus( │ │ (conv): BaseConv( │ │ (conv): ... │ └ <function Trainer.evaluate_and_save_model_orca at 0x7f850c4cbdd0> └ <yolox.core.trainer.Trainer object at 0x7f852f735690>
File "/home/lirong/YOLO_RUN/YOLOX/yolox/core/trainer.py", line 484, in evaluate_and_save_model_orca
self.is_distributed
│ └ False └ <yolox.core.trainer.Trainer object at 0x7f852f735690>
File "/home/lirong/YOLO_RUN/YOLOX/yolox/exp/yolox_base.py", line 323, in eval
return evaluator.evaluate(model, is_distributed, half)
│ │ │ │ └ False │ │ │ └ False │ │ └ YOLOX( │ │ (backbone): YOLOPAFPN( │ │ (backbone): CSPDarknet( │ │ (stem): Focus( │ │ (conv): BaseConv( │ │ (conv): ... │ └ <function COCOEvaluator.evaluate at 0x7f852e1dfc20> └ <yolox.evaluators.coco_evaluator.COCOEvaluator object at 0x7f8505017cd0>
File "/home/lirong/YOLO_RUN/YOLOX/yolox/evaluators/coco_evaluator.py", line 189, in evaluate
data_list.extend(self.convert_to_coco_format(outputs, info_imgs, ids))
│ │ │ │ │ │ └ tensor([[1], │ │ │ │ │ │ [2], │ │ │ │ │ │ [3], │ │ │ │ │ │ [4], │ │ │ │ │ │ [5]]) │ │ │ │ │ └ [tensor([480, 426, 428, 425, 640]), tensor([640, 640, 640, 640, 481])] │ │ │ │ └ [tensor([[ 3.0659e+02, -1.2641e+00, 3.2204e+02, 2.4427e+02, 9.7571e-01, │ │ │ │ 9.5652e-01, 4.5000e+01], │ │ │ │ [ 2.07... │ │ │ └ <function COCOEvaluator.convert_to_coco_format at 0x7f852e1dfb90> │ │ └ <yolox.evaluators.coco_evaluator.COCOEvaluator object at 0x7f8505017cd0> │ └ <method 'extend' of 'list' objects> └ []
File "/home/lirong/YOLO_RUN/YOLOX/yolox/evaluators/coco_evaluator.py", line 224, in convert_to_coco_format
label = self.dataloader.dataset.class_ids[int(cls[ind])]
│ │ │ │ │ └ 0 │ │ │ │ └ tensor([75., 58., 60., 47., 32., 13., 49., 32.]) │ │ │ └ [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 3... │ │ └ <yolox.data.datasets.coco.COCODataset object at 0x7f8505017d90> │ └ <torch.utils.data.dataloader.DataLoader object at 0x7f8505017e50> └ <yolox.evaluators.coco_evaluator.COCOEvaluator object at 0x7f8505017cd0>
The difference between the full COCO dataset and COCO128 is that COCO128 has only 128 images in the train2017 folder and 5 images in the val2017 folder. When YOLOX loads the images from these folders and reads the JSON annotations, the evaluation output is wrong, and we could not fix it. So use the full COCO dataset instead of the mini dataset (COCO128).
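The last frame of the traceback indexes class_ids with the predicted class index, so one thing worth checking (the post does not confirm this is the cause) is whether every predicted index fits inside the category list declared in the annotation file. The following is a minimal sketch, assuming the annotation file is COCO-style JSON with a top-level "categories" list; the function names are hypothetical and are not part of YOLOX.

```python
import json


def load_class_ids(annotation_file):
    """Return the sorted category ids declared in a COCO-style annotation file.

    Assumption: the file has a top-level "categories" list of {"id": ...} dicts.
    """
    with open(annotation_file, "r", encoding="utf-8") as f:
        data = json.load(f)
    return sorted(cat["id"] for cat in data.get("categories", []))


def out_of_range_predictions(class_ids, predicted_indices):
    """Return the predicted class indices that cannot index class_ids."""
    return [
        int(i) for i in predicted_indices
        if not 0 <= int(i) < len(class_ids)
    ]


if __name__ == "__main__":
    # Predicted class indices copied from the traceback above.
    predicted = [75.0, 58.0, 60.0, 47.0, 32.0, 13.0, 49.0, 32.0]
    # Hypothetical dataset that declares only 71 categories.
    class_ids = list(range(1, 72))
    print(out_of_range_predictions(class_ids, predicted))
```

If the returned list is non-empty, the model predicts classes that the annotation file does not declare, and the lookup in the last frame would fail for those predictions.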
Links to similar issues:
yolox-s with default config cannot converge in mini-coco128: https://github.com/Megvii-BaseDetection/YOLOX/issues/995
the best AP is 0.00: https://github.com/Megvii-BaseDetection/YOLOX/issues/344 https://github.com/Megvii-BaseDetection/YOLOX/issues/1199
Training YOLOX on the COCO128 dataset leads to errors such as incorrect output from the evaluate function, so we need to train on the full COCO dataset instead of the mini dataset. The full COCO dataset has three folders: annotations, train2017, and val2017. There are 118287 images in train2017 and 5000 images in val2017.
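Before starting a long training run, it may help to confirm that the dataset on disk really is the full COCO dataset. The following is a minimal sketch that checks for the three folders and the image counts quoted above; the function name, the set of image suffixes, and the example root path are assumptions made for illustration.

```python
from pathlib import Path

# Image counts quoted above for the full COCO dataset.
EXPECTED_COUNTS = {"train2017": 118287, "val2017": 5000}
# Assumption: images are stored with one of these file suffixes.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def check_coco_layout(root, expected_counts=None):
    """Return a list of problems found in the dataset folder (empty if none)."""
    if expected_counts is None:
        expected_counts = EXPECTED_COUNTS
    root = Path(root)
    problems = []
    if not (root / "annotations").is_dir():
        problems.append("missing folder: annotations")
    for name, expected in expected_counts.items():
        folder = root / name
        if not folder.is_dir():
            problems.append(f"missing folder: {name}")
            continue
        # Count only files whose suffix looks like an image.
        found = sum(
            1 for p in folder.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
        if found != expected:
            problems.append(f"{name}: found {found} images, expected {expected}")
    return problems


if __name__ == "__main__":
    # Hypothetical dataset location; adjust to where COCO is stored.
    for problem in check_coco_layout("datasets/COCO"):
        print(problem)
```

Run against a COCO128 folder, this would report 128 and 5 images instead of the expected counts, which makes it easy to spot that the mini dataset is being used by mistake.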