Megvii-BaseDetection / YOLOX

YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
Apache License 2.0

Error during training, shown when one epoch completes #1803

Open 666zhouzhou6666 opened 1 month ago

666zhouzhou6666 commented 1 month ago

2024-10-13 23:12:30 | ERROR | yolox.core.launch:98 - An error has been caught in function 'launch', process 'MainProcess' (14668), thread 'MainThread' (35416): Traceback (most recent call last):

File "D:\Gugexaizai\YOLOX-0.2.0\tools\train.py", line 129, in launch( └ <function launch at 0x000001950AB11870>

File "D:\Gugexaizai\YOLOX-0.2.0\yolox\core\launch.py", line 98, in launch main_func(*args) │ └ (╒══════════════════╤════════════════════════════════════════════════════════════════════════════════════════════════════════... └ <function main at 0x000001950BFAB640>

File "D:\Gugexaizai\YOLOX-0.2.0\tools\train.py", line 114, in main trainer.train() │ └ <function Trainer.train at 0x000001950BF2AD40> └ <yolox.core.trainer.Trainer object at 0x000001950BF77010>

File "D:\Gugexaizai\YOLOX-0.2.0\yolox\core\trainer.py", line 72, in train self.train_in_epoch() │ └ <function Trainer.train_in_epoch at 0x000001950BF51120> └ <yolox.core.trainer.Trainer object at 0x000001950BF77010>

File "D:\Gugexaizai\YOLOX-0.2.0\yolox\core\trainer.py", line 82, in train_in_epoch self.after_epoch() │ └ <function Trainer.after_epoch at 0x000001950BFAAC20> └ <yolox.core.trainer.Trainer object at 0x000001950BF77010>

File "D:\Gugexaizai\YOLOX-0.2.0\yolox\core\trainer.py", line 207, in after_epoch self.evaluate_and_save_model() │ └ <function Trainer.evaluate_and_save_model at 0x000001950BFAAEF0> └ <yolox.core.trainer.Trainer object at 0x000001950BF77010>

File "D:\Gugexaizai\YOLOX-0.2.0\yolox\core\trainer.py", line 302, in evaluate_and_save_model ap50_95, ap50, summary = self.exp.eval( │ │ └ <function Exp.eval at 0x000001950BFABD90> │ └ ╒══════════════════╤═════════════════════════════════════════════════════════════════════════════════════════════════════════... └ <yolox.core.trainer.Trainer object at 0x000001950BF77010>

File "D:\Gugexaizai\YOLOX-0.2.0\yolox\exp\yolox_base.py", line 285, in eval return evaluator.evaluate(model, is_distributed, half) │ │ │ │ └ False │ │ │ └ False │ │ └ YOLOX( │ │ (backbone): YOLOPAFPN( │ │ (backbone): CSPDarknet( │ │ (stem): Focus( │ │ (conv): BaseConv( │ │ (conv): ... │ └ <function VOCEvaluator.evaluate at 0x000001950BFA9D80> └ <yolox.evaluators.voc_evaluator.VOCEvaluator object at 0x0000019514545810>

File "D:\Gugexaizai\YOLOX-0.2.0\yolox\evaluators\voc_evaluator.py", line 128, in evaluate eval_results = self.evaluate_prediction(data_list, statistics) │ │ │ └ tensor([ 1.0139, 0.1483, 15.0000], device='cuda:0') │ │ └ {0: (None, None, None), 1: (None, None, None), 2: (None, None, None), 3: (None, None, None), 4: (None, None, None), 5: (None,... │ └ <function VOCEvaluator.evaluate_prediction at 0x000001950BFA9EA0> └ <yolox.evaluators.voc_evaluator.VOCEvaluator object at 0x0000019514545810>

File "D:\Gugexaizai\YOLOX-0.2.0\yolox\evaluators\voc_evaluator.py", line 205, in evaluate_prediction mAP50, mAP70 = self.dataloader.dataset.evaluate_detections( │ │ │ └ <function VOCDetection.evaluate_detections at 0x000001950BFAA710> │ │ └ <yolox.data.datasets.voc.VOCDetection object at 0x0000019514585390> │ └ <torch.utils.data.dataloader.DataLoader object at 0x00000195145449D0> └ <yolox.evaluators.voc_evaluator.VOCEvaluator object at 0x0000019514545810>

File "D:\Gugexaizai\YOLOX-0.2.0\yolox\data\datasets\voc.py", line 265, in evaluate_detections self._write_voc_results_file(all_boxes) │ │ └ [[array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), arr... │ └ <function VOCDetection._write_voc_results_file at 0x000001950BFAA830> └ <yolox.data.datasets.voc.VOCDetection object at 0x0000019514585390>

File "D:\Gugexaizai\YOLOX-0.2.0\yolox\data\datasets\voc.py", line 299, in _write_voc_results_file if dets == []: └ array([], shape=(0, 5), dtype=float32)

ValueError: operands could not be broadcast together with shapes (0,5) (0,)

ChialinSung commented 2 weeks ago

2024-11-07 11:16:42 | INFO | yolox.core.trainer:218 - ---> start train epoch1
2024-11-07 11:16:48 | INFO | yolox.core.trainer:270 - epoch: 1/300, iter: 10/134, gpu mem: 9301Mb, mem: 24.6Gb, iter_time: 0.677s, data_time: 0.059s, total_loss: 15.8, iou_loss: 3.8, l1_loss: 0.0, conf_loss: 10.0, cls_loss: 2.0, lr: 1.392e-07, size: 1280, ETA: 7:33:23
2024-11-07 11:16:54 | INFO | yolox.core.trainer:270 - epoch: 1/300, iter: 20/134, gpu mem: 9301Mb, mem: 24.5Gb, iter_time: 0.592s, data_time: 0.007s, total_loss: 17.0, iou_loss: 3.7, l1_loss: 0.0, conf_loss: 11.3, cls_loss: 2.0, lr: 5.569e-07, size: 1440, ETA: 7:04:50
2024-11-07 11:17:01 | INFO | yolox.core.trainer:270 - epoch: 1/300, iter: 30/134, gpu mem: 9301Mb, mem: 24.5Gb, iter_time: 0.695s, data_time: 0.187s, total_loss: 14.9, iou_loss: 3.8, l1_loss: 0.0, conf_loss: 9.1, cls_loss: 2.0, lr: 1.253e-06, size: 1312, ETA: 7:18:08
2024-11-07 11:17:07 | INFO | yolox.core.trainer:270 - epoch: 1/300, iter: 40/134, gpu mem: 9301Mb, mem: 24.5Gb, iter_time: 0.587s, data_time: 0.027s, total_loss: 14.3, iou_loss: 3.8, l1_loss: 0.0, conf_loss: 8.6, cls_loss: 2.0, lr: 2.228e-06, size: 1408, ETA: 7:06:43
2024-11-07 11:17:14 | INFO | yolox.core.trainer:270 - epoch: 1/300, iter: 50/134, gpu mem: 9301Mb, mem: 24.8Gb, iter_time: 0.690s, data_time: 0.246s, total_loss: 14.3, iou_loss: 3.8, l1_loss: 0.0, conf_loss: 8.7, cls_loss: 1.7, lr: 3.481e-06, size: 1184, ETA: 7:13:41
2024-11-07 11:17:20 | INFO | yolox.core.trainer:270 - epoch: 1/300, iter: 60/134, gpu mem: 9301Mb, mem: 24.6Gb, iter_time: 0.579s, data_time: 0.269s, total_loss: 12.7, iou_loss: 3.4, l1_loss: 0.0, conf_loss: 7.1, cls_loss: 2.1, lr: 5.012e-06, size: 1408, ETA: 7:05:49
2024-11-07 11:17:27 | INFO | yolox.core.trainer:270 - epoch: 1/300, iter: 70/134, gpu mem: 9301Mb, mem: 24.7Gb, iter_time: 0.698s, data_time: 0.148s, total_loss: 10.9, iou_loss: 3.4, l1_loss: 0.0, conf_loss: 5.8, cls_loss: 1.7, lr: 6.822e-06, size: 1376, ETA: 7:11:35
2024-11-07 11:17:33 | INFO | yolox.core.trainer:270 - epoch: 1/300, iter: 80/134, gpu mem: 9301Mb, mem: 24.6Gb, iter_time: 0.575s, data_time: 0.113s, total_loss: 11.0, iou_loss: 3.7, l1_loss: 0.0, conf_loss: 5.6, cls_loss: 1.7, lr: 8.911e-06, size: 1216, ETA: 7:05:39
2024-11-07 11:17:39 | INFO | yolox.core.trainer:270 - epoch: 1/300, iter: 90/134, gpu mem: 9301Mb, mem: 24.6Gb, iter_time: 0.683s, data_time: 0.414s, total_loss: 11.1, iou_loss: 3.6, l1_loss: 0.0, conf_loss: 6.0, cls_loss: 1.5, lr: 1.128e-05, size: 1280, ETA: 7:09:01
2024-11-07 11:17:45 | INFO | yolox.core.trainer:270 - epoch: 1/300, iter: 100/134, gpu mem: 9301Mb, mem: 24.7Gb, iter_time: 0.597s, data_time: 0.098s, total_loss: 10.3, iou_loss: 3.6, l1_loss: 0.0, conf_loss: 5.5, cls_loss: 1.2, lr: 1.392e-05, size: 1248, ETA: 7:05:53
2024-11-07 11:17:52 | INFO | yolox.core.trainer:270 - epoch: 1/300, iter: 110/134, gpu mem: 9301Mb, mem: 24.9Gb, iter_time: 0.678s, data_time: 0.413s, total_loss: 9.1, iou_loss: 3.4, l1_loss: 0.0, conf_loss: 4.5, cls_loss: 1.2, lr: 1.685e-05, size: 1280, ETA: 7:08:14
2024-11-07 11:17:58 | INFO | yolox.core.trainer:270 - epoch: 1/300, iter: 120/134, gpu mem: 9301Mb, mem: 24.8Gb, iter_time: 0.583s, data_time: 0.169s, total_loss: 9.8, iou_loss: 3.3, l1_loss: 0.0, conf_loss: 5.0, cls_loss: 1.5, lr: 2.005e-05, size: 1120, ETA: 7:04:55
2024-11-07 11:18:05 | INFO | yolox.core.trainer:270 - epoch: 1/300, iter: 130/134, gpu mem: 9301Mb, mem: 24.9Gb, iter_time: 0.685s, data_time: 0.371s, total_loss: 9.4, iou_loss: 3.2, l1_loss: 0.0, conf_loss: 5.1, cls_loss: 1.2, lr: 2.353e-05, size: 1408, ETA: 7:07:20
2024-11-07 11:18:07 | INFO | yolox.core.trainer:402 - Save weights to ./YOLOX_outputs/drd_exp
100%|###########################################| 34/34 [00:04<00:00, 7.30it/s]
2024-11-07 11:18:13 | INFO | yolox.evaluators.voc_evaluator:144 - Evaluate in main process...
Writing LA VOC results file
2024-11-07 11:18:13 | ERROR | yolox.core.trainer:79 - Exception in training:
2024-11-07 11:18:13 | INFO | yolox.core.trainer:200 - Training of experiment is done and the best AP is 0.00
2024-11-07 11:18:13 | ERROR | yolox.core.launch:98 - An error has been caught in function 'launch', process 'MainProcess' (28400), thread 'MainThread' (139758362833024):
Traceback (most recent call last):

File "/home/wsjc/S2021/songjl/220430/yolox/YOLOX-main/tools/train.py", line 138, in launch( └ <function launch at 0x7f1ad420b640>

File "/home/wsjc/S2021/songjl/220430/yolox/YOLOX-main/yolox/core/launch.py", line 98, in launch main_func(*args) │ └ (╒═══════════════════╤═══════════════════════════════════════════════════════════════════════════════════════════════════════... └ <function main at 0x7f1ac2640550>

File "/home/wsjc/S2021/songjl/220430/yolox/YOLOX-main/tools/train.py", line 118, in main trainer.train() │ └ <function Trainer.train at 0x7f1ac24a1090> └ <yolox.core.trainer.Trainer object at 0x7f1ac24a81c0>

File "/home/wsjc/S2021/songjl/220430/yolox/YOLOX-main/yolox/core/trainer.py", line 77, in train self.train_in_epoch() │ └ <function Trainer.train_in_epoch at 0x7f1ac24a1900> └ <yolox.core.trainer.Trainer object at 0x7f1ac24a81c0>

File "/home/wsjc/S2021/songjl/220430/yolox/YOLOX-main/yolox/core/trainer.py", line 88, in train_in_epoch self.after_epoch() │ └ <function Trainer.after_epoch at 0x7f1ac24a1c60> └ <yolox.core.trainer.Trainer object at 0x7f1ac24a81c0>

File "/home/wsjc/S2021/songjl/220430/yolox/YOLOX-main/yolox/core/trainer.py", line 237, in after_epoch self.evaluate_and_save_model() │ └ <function Trainer.evaluate_and_save_model at 0x7f1ac24a1f30> └ <yolox.core.trainer.Trainer object at 0x7f1ac24a81c0>

File "/home/wsjc/S2021/songjl/220430/yolox/YOLOX-main/yolox/core/trainer.py", line 355, in evaluate_and_save_model (ap50_95, ap50, summary), predictions = self.exp.eval( │ │ └ <function Exp.eval at 0x7f1ac24a1870> │ └ ╒═══════════════════╤════════════════════════════════════════════════════════════════════════════════════════════════════════... └ <yolox.core.trainer.Trainer object at 0x7f1ac24a81c0>

File "/home/wsjc/S2021/songjl/220430/yolox/YOLOX-main/yolox/exp/yolox_base.py", line 353, in eval return evaluator.evaluate(model, is_distributed, half, return_outputs=return_outputs) │ │ │ │ │ └ True │ │ │ │ └ False │ │ │ └ False │ │ └ YOLOX( │ │ (backbone): YOLOPAFPN( │ │ (backbone): CSPDarknet( │ │ (stem): Focus( │ │ (conv): BaseConv( │ │ (conv): ... │ └ <function VOCEvaluator.evaluate at 0x7f1ac2487b50> └ <yolox.evaluators.voc_evaluator.VOCEvaluator object at 0x7f1ab0324550>

File "/home/wsjc/S2021/songjl/220430/yolox/YOLOX-main/yolox/evaluators/voc_evaluator.py", line 114, in evaluate eval_results = self.evaluate_prediction(data_list, statistics) │ │ │ └ tensor([ 1.5594, 0.1323, 33.0000], device='cuda:0') │ │ └ {0: (tensor([[ 532.7663, 388.5279, 1316.4070, 2144.0371], │ │ [2980.8784, 2769.1257, 4348.3936, 3396.9895], │ │ [175... │ └ <function VOCEvaluator.evaluate_prediction at 0x7f1ac2487c70> └ <yolox.evaluators.voc_evaluator.VOCEvaluator object at 0x7f1ab0324550>

File "/home/wsjc/S2021/songjl/220430/yolox/YOLOX-main/yolox/evaluators/voc_evaluator.py", line 186, in evaluate_prediction mAP50, mAP70 = self.dataloader.dataset.evaluate_detections(all_boxes, tempdir) │ │ │ │ │ └ '/tmp/tmpq5xf9vjr' │ │ │ │ └ [[array([[1.07199146e+03, 4.26259369e+02, 1.16201208e+03, 8.33316711e+02, │ │ │ │ 5.14281727e-02], │ │ │ │ [2.27488110e+03, 4.... │ │ │ └ <function VOCDetection.evaluate_detections at 0x7f1ac24a0670> │ │ └ <yolox.data.datasets.voc.VOCDetection object at 0x7f1ab03251e0> │ └ <torch.utils.data.dataloader.DataLoader object at 0x7f1ab0324ee0> └ <yolox.evaluators.voc_evaluator.VOCEvaluator object at 0x7f1ab0324550>

File "/home/wsjc/S2021/songjl/220430/yolox/YOLOX-main/yolox/data/datasets/voc.py", line 230, in evaluate_detections self._write_voc_results_file(all_boxes) │ │ └ [[array([[1.07199146e+03, 4.26259369e+02, 1.16201208e+03, 8.33316711e+02, │ │ 5.14281727e-02], │ │ [2.27488110e+03, 4.... │ └ <function VOCDetection._write_voc_results_file at 0x7f1ac24a0790> └ <yolox.data.datasets.voc.VOCDetection object at 0x7f1ab03251e0>

File "/home/wsjc/S2021/songjl/220430/yolox/YOLOX-main/yolox/data/datasets/voc.py", line 264, in _write_voc_results_file if dets == []: └ array([[1.07199146e+03, 4.26259369e+02, 1.16201208e+03, 8.33316711e+02, 5.14281727e-02], [2.27488110e+03, 4.09...

ValueError: operands could not be broadcast together with shapes (11,5) (0,)

Same problem here. Please help, thanks!

ChialinSung commented 2 weeks ago

Change if dets == []: to if(dets.shape[0]==0): and that solves it!
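For anyone who wants to check this outside of YOLOX, here is a minimal standalone sketch of why the shape-based check avoids the error. It assumes dets is a NumPy array with 5 columns, as the shapes in the tracebacks suggest. The helper names is_empty_dets and old_check are made up for this illustration and are not part of YOLOX. Whether the old comparison raises or only warns seems to depend on the installed NumPy version, so the sketch reports what happens instead of assuming an outcome.

```python
import numpy as np


def is_empty_dets(dets):
    """Return True when a detections array has zero rows.

    Hypothetical helper; assumes `dets` is a NumPy array of shape (N, 5).
    """
    return dets.shape[0] == 0


def old_check(dets):
    """Run the original comparison and describe what happened."""
    try:
        # [] is converted to an array of shape (0,), which cannot be
        # broadcast against an array of shape (N, 5).
        result = dets == []
    except ValueError as exc:
        return f"ValueError: {exc}"
    return f"no error, result: {result!r}"


empty = np.zeros((0, 5), dtype=np.float32)   # shape from the first traceback
found = np.ones((11, 5), dtype=np.float32)   # shape from the second traceback

for name, dets in (("empty", empty), ("found", found)):
    print(name, "| old check ->", old_check(dets))
    print(name, "| new check ->", is_empty_dets(dets))
```

The shape-based check never compares the array against a list, so no broadcasting is involved. If dets could also be a plain Python list in some code path, len(dets) == 0 would cover both cases, but that is an assumption about the caller and not something shown in this thread.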