lyuwenyu / RT-DETR

[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
Apache License 2.0

ERROR: I use PyTorch to train but I got this #51

Closed: LordonCN closed this issue 1 year ago

LordonCN commented 1 year ago
Accumulating evaluation results...
DONE (t=10.48s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
best_stat:  {'epoch': 0, 'coco_eval_bbox': 3.9432714116975085e-10}
lyuwenyu commented 1 year ago

Training rtdetr-r18 on COCO works normally for me. Please give more information about what you have done, not just the final result.

IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.151
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.232
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.160
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.095
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.182
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.199
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.220
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.400
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.490
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.245
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.516
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.714
best_stat:  {'epoch': 0, 'coco_eval_bbox': 0.15060986268807167}
LordonCN commented 1 year ago

I changed nothing except logger.py. I will check again, thanks.

LordonCN commented 1 year ago

I am training on a single GPU:

python tools/train.py -c configs/rtdetr/rtdetr_r50vd_6x_coco.yml

It outputs this:

Not init distributed mode.
Start training
Load PResNet50 state_dict
Initial lr:  [1e-05, 0.0001, 0.0001, 0.0001]
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
loading annotations into memory...
Done (t=0.49s)
creating index...
index created!
number of params: 42862860
Traceback (most recent call last):
  File "tools/train.py", line 38, in <module>
    main(args)
  File "tools/train.py", line 24, in main
    solver.fit()
  File "/home/aebuser/RT-DETR-main/rtdetr_pytorch/tools/../src/solver/det_solver.py", line 37, in fit
    train_stats = train_one_epoch(
  File "/home/aebuser/RT-DETR-main/rtdetr_pytorch/tools/../src/solver/det_engine.py", line 35, in train_one_epoch
    for samples, targets in metric_logger.log_every(data_loader, print_freq, header):
  File "/home/aebuser/RT-DETR-main/rtdetr_pytorch/tools/../src/misc/logger.py", line 238, in log_every
    header, total_time_str, total_time / len(iterable)))
ZeroDivisionError: float division by zero

The dataset is here:

(rt-petr) aebuser@Precision-5820:~/RT-DETR-main/rtdetr_pytorch$ ls dataset/coco
annotations  images  labels  LICENSE  README.txt  test-dev2017.txt  train2017  train2017.txt  val2017  val2017.txt
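
The listing confirms that the folders exist, but not that the annotation files actually contain entries. A minimal sketch for checking that, assuming COCO-style JSON with top-level `images`, `annotations`, and `categories` lists; the helper name `count_coco_entries` and the file paths below are illustrative assumptions, not taken from the repository:

```python
import json
import os


def count_coco_entries(annotation_path):
    """Return how many images, annotations and categories a COCO-style JSON holds."""
    with open(annotation_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    # Missing keys are counted as zero rather than raising.
    return {key: len(data.get(key, [])) for key in ("images", "annotations", "categories")}


if __name__ == "__main__":
    # Assumed paths; adjust to whatever annotation files your dataset config points to.
    for path in (
        "dataset/coco/annotations/instances_train2017.json",
        "dataset/coco/annotations/instances_val2017.json",
    ):
        if not os.path.isfile(path):
            print(f"{path}: file not found")
            continue
        print(f"{path}: {count_coco_entries(path)}")
```

If the training file reports zero images, the training data loader would have no batches to iterate over.
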
lyuwenyu commented 1 year ago

Not init distributed mode.
Start training
Load PResNet50 state_dict
Initial lr: [1e-05, 0.0001, 0.0001, 0.0001]
loading annotations into memory...
Done (t=18.14s)
creating index...
index created!
loading annotations into memory...
Done (t=0.57s)
creating index...
index created!
number of params: 42862860
Epoch: [0] [ 0/29571] eta: 1 day, 3:30:36 lr: 0.000010 loss: 52.9165 (52.9165) loss_vfl: 0.3373 (0.3373) loss_bbox: 2.4941 (2.4941) loss_giou: 1.6976 (1.6976) loss_vfl_aux_0: 0.3209 (0.3209) loss_bbox_aux_0: 2.4534 (2.4534) loss_giou_aux_0: 1.7099 (1.7099) loss_vfl_aux_1: 0.2884 (0.2884) loss_bbox_aux_1: 2.4570 (2.4570) loss_giou_aux_1: 1.7213 (1.7213) loss_vfl_aux_2: 0.2807 (0.2807) loss_bbox_aux_2: 2.5275 (2.5275) loss_giou_aux_2: 1.7080 (1.7080) loss_vfl_aux_3: 0.2857 (0.2857) loss_bbox_aux_3: 2.4716 (2.4716) loss_giou_aux_3: 1.7132 (1.7132) loss_vfl_aux_4: 0.3352 (0.3352) loss_bbox_aux_4: 2.4743 (2.4743) loss_giou_aux_4: 1.6857 (1.6857) loss_vfl_aux_5: 0.3139 (0.3139) loss_bbox_aux_5: 2.5399 (2.5399) loss_giou_aux_5: 1.7607 (1.7607) loss_vfl_dn_0: 1.0065 (1.0065) loss_bbox_dn_0: 1.3145 (1.3145) loss_giou_dn_0: 1.2791 (1.2791) loss_vfl_dn_1: 0.9514 (0.9514) loss_bbox_dn_1: 1.3145 (1.3145) loss_giou_dn_1: 1.2791 (1.2791) loss_vfl_dn_2: 0.9279 (0.9279) loss_bbox_dn_2: 1.3145 (1.3145) loss_giou_dn_2: 1.2791 (1.2791) loss_vfl_dn_3: 0.9253 (0.9253) loss_bbox_dn_3: 1.3145 (1.3145) loss_giou_dn_3: 1.2791 (1.2791) loss_vfl_dn_4: 0.9912 (0.9912) loss_bbox_dn_4: 1.3145 (1.3145) loss_giou_dn_4: 1.2791 (1.2791) loss_vfl_dn_5: 0.9766 (0.9766) loss_bbox_dn_5: 1.3145 (1.3145) loss_giou_dn_5: 1.2791 (1.2791) time: 3.3491 data: 0.7294 max mem: 2441


- Make sure your training data is non-empty. The error is raised on this line, which divides by zero when len(iterable) is 0:

header, total_time_str, total_time / len(iterable)))

ZeroDivisionError: float division by zero


<img width="590" alt="image" src="https://github.com/lyuwenyu/RT-DETR/assets/17582080/6d213e01-dab1-41b9-a060-39000aaa2bb9">
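
The division in the traceback fails only when the iterable has length zero, so one option is to fail early with a clearer message. What follows is a simplified stand-in for a log_every-style helper, not the repository's actual logger.py; the function body and the error text are assumptions for illustration:

```python
import time


def log_every(iterable, print_freq, header=""):
    """Yield items from a sized iterable, printing progress and average time per item.

    Simplified illustration only; this is not the code in src/misc/logger.py.
    """
    n_items = len(iterable)
    if n_items == 0:
        # Fail early with an actionable message instead of dividing by zero later.
        raise ValueError(
            f"{header} iterable is empty; check the dataset paths and annotation files"
        )

    start = time.time()
    for i, item in enumerate(iterable):
        yield item
        if i % print_freq == 0 or i == n_items - 1:
            print(f"{header} [{i}/{n_items}]")

    total_time = time.time() - start
    # Safe: n_items is guaranteed to be non-zero at this point.
    print(f"{header} Total time: {total_time:.2f}s ({total_time / n_items:.4f} s / it)")
```

Because this is a generator, the check runs when iteration starts, which is the same place the training loop first touches the data loader.
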
LordonCN commented 1 year ago

It works, thanks!