jshilong / DDQ

Dense Distinct Query for End-to-End Object Detection (CVPR2023)
Apache License 2.0
244 stars · 6 forks

Loss = 0 and AP = -1 #8

Closed chenhbo closed 1 year ago

chenhbo commented 1 year ago

Hi, congratulations on the acceptance! And thanks for sharing this nice work.

I'm training on a custom dataset, which only has two classes. But the loss is always 0 and the AP is always -1.

Could you give some suggestions?

```
05/15 18:42:11 - mmengine - INFO - Epoch(train) [1][ 50/100]
  lr: 4.9220e-06  eta: 0:05:28  time: 0.2854  data_time: 0.0033  memory: 4989  grad_norm: 1.1815
  loss: 0.0139  loss_cls: 0.0011  d0.loss_cls: 0.0002  d1.loss_cls: 0.0005  d2.loss_cls: 0.0001
  d3.loss_cls: 0.0004  d4.loss_cls: 0.0030  enc_loss_cls: 0.0086
  [all remaining loss terms (bbox/iou/dn/aux): 0.0000]
05/15 18:42:25 - mmengine - INFO - Exp name: ddq-detr-4scale_r50_8xb2-12e_coco_20230515_184148
05/15 18:42:25 - mmengine - INFO - Epoch(train) [1][100/100]
  lr: 9.9240e-06  eta: 0:05:04  time: 0.2680  data_time: 0.0026  memory: 4669  grad_norm: 0.0031
  loss: 0.0001  enc_loss_cls: 0.0001
  [all remaining loss terms: 0.0000]
05/15 18:42:25 - mmengine - INFO - Saving checkpoint at 1 epochs
/media/New Volume/Project/DETR/DDQ-ddq_detr/mmdet/models/layers/positional_encoding.py:84: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  dim_t = self.temperature**(2 * (dim_t // 2) / self.num_feats)
/home/anaconda3/envs/detr/lib/python3.7/site-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1646755953518/work/aten/src/ATen/native/TensorShape.cpp:2228.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
/media/New Volume/Project/DETR/DDQ-ddq_detr/mmdet/models/layers/transformer/utils.py:71: UserWarning: __floordiv__ is deprecated (same warning as above)
  dim_t = temperature**(2 * (dim_t // 2) / num_feats)
05/15 18:42:31 - mmengine - INFO - Epoch(val) [1][50/50]  eta: 0:00:00  time: 0.0446  data_time: 0.0018  memory: 723
```
```
05/15 18:42:31 - mmengine - INFO - Evaluating bbox...
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.04s).
Accumulating evaluation results...
DONE (t=0.01s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100  ] = -1.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = -1.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=1000 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100  ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300  ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = -1.000
```

jshilong commented 1 year ago

Thank you for your interest. Have you confirmed that the annotations are correct? It may be helpful to try other detection methods in mmdetection to determine whether the issue is with our algorithm or not.
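As a quick sanity check before training, one can inspect the annotation file directly. A sketch assuming a COCO-format JSON; the helper name and the expected class list are illustrative, not part of DDQ:

```python
import json

def check_coco_annotations(ann_file, expected_classes):
    """Sanity-check a COCO-format annotation file before training.

    Returns the category names in the file, the number of annotations
    per class, and any expected classes that are missing. AP = -1 during
    evaluation typically means there are no ground truths for a category.
    """
    with open(ann_file) as f:
        ann = json.load(f)

    cats = [c["name"] for c in ann["categories"]]
    id_to_name = {c["id"]: c["name"] for c in ann["categories"]}

    # Count annotations per class; a class with zero boxes evaluates to -1.
    counts = {}
    for a in ann["annotations"]:
        name = id_to_name[a["category_id"]]
        counts[name] = counts.get(name, 0) + 1

    missing = set(expected_classes) - set(cats)
    return cats, counts, missing
```

If `missing` is non-empty, or a class has zero annotations, the dataset config (not the model) is the likely culprit.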

chenhbo commented 1 year ago

Thanks for your reply :-). Yes, you are correct. It's because the classes of my custom dataset are not included in mmdetection/coco.py. I added the new classes and it works now.
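For reference, in recent MMDetection 3.x configs a custom dataset's classes can also be registered via `metainfo` instead of editing coco.py. A hedged config sketch; the paths and class names are placeholders, and the exact keys may differ across MMDetection versions:

```python
# Hypothetical config fragment for a 2-class custom dataset
# (MMDetection 3.x style). Paths and class names are placeholders.
metainfo = dict(classes=('class_a', 'class_b'))

train_dataloader = dict(
    dataset=dict(
        metainfo=metainfo,
        ann_file='annotations/train.json',  # placeholder path
        data_prefix=dict(img='train/')))
val_dataloader = dict(
    dataset=dict(
        metainfo=metainfo,
        ann_file='annotations/val.json',    # placeholder path
        data_prefix=dict(img='val/')))

# The head must also match the number of classes,
# otherwise the classification loss can silently misbehave.
model = dict(bbox_head=dict(num_classes=2))
```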

By the way, as I understand it, DDQ-DETR aims to remove redundant detections by applying NMS inside the training and inference stages. But I found that on my custom dataset, during inference, there are still multiple bboxes for the same object. Do you have any suggestions to improve this?

Thank you so much!

jshilong commented 1 year ago

Did you check the corresponding score of each prediction? Under one-to-one assignment there is only one positive sample, so there should be only one high-score bounding box per ground truth. You can then filter the redundant bounding boxes with a score threshold.
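A minimal sketch of that post-hoc filtering; the function and variable names are illustrative, not from the DDQ codebase:

```python
import numpy as np

def filter_by_score(boxes, scores, labels, score_thr=0.3):
    """Keep only predictions whose classification score exceeds score_thr.

    Under one-to-one assignment, only one box per object should score
    high, so a plain threshold is usually enough to drop the duplicates.
    """
    keep = scores > score_thr
    return boxes[keep], scores[keep], labels[keep]
```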

chenhbo commented 1 year ago

Thanks for your quick reply.

Yes, I checked the scores. As the figure shows, the bbox scores for this object are around 0.2 (displayed as 20). If I set the threshold to 0.3, the redundant bboxes are filtered out, but all the bboxes for this object are removed as well.

Actually, the overall performance is quite good. Thanks for your nice work! But for some objects in my case the classification score is too low. Is this because I only have two classes? I'm trying to train for longer and to reduce the number of queries. Thank you so much!

jshilong commented 1 year ago

Longer training may solve your problem. Moreover, if your dataset does not include crowd instances (two instances with high overlap), you might consider lowering the iou_threshold of DQS to 0.7 or 0.6, which could directly remove redundant bounding boxes and help enlarge the score gap between positive and negative samples. https://github.com/jshilong/DDQ/blob/a166d18658b6b5b57621c00d6aa04e52a80e65bd/projects/configs/ddq_detr/ddq-detr-4scale_r50_8xb2-12e_coco.py#L8
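For intuition, the effect of lowering the DQS iou_threshold can be reproduced with a plain class-agnostic NMS sketch. This is a toy re-implementation, not the code behind the linked config: a lower threshold suppresses more overlapping boxes.

```python
import numpy as np

def iou(box, others):
    """IoU between one box and an array of boxes, all in (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], others[:, 0])
    y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[2], others[:, 2])
    y2 = np.minimum(box[3], others[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / (area_a + area_b - inter)

def class_agnostic_nms(boxes, scores, iou_threshold=0.8):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap
    it by more than iou_threshold, then repeat on the survivors."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) <= iou_threshold]
    return keep
```

With two boxes overlapping at IoU = 0.7, a threshold of 0.8 keeps both, while 0.6 suppresses the lower-scoring one, which is the behavior the reply describes.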

chenhbo commented 1 year ago

Thanks for your suggestions, 0.6 works well for me.