IDEA-Research / DINO

[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
Apache License 2.0

Loss cardinality does not make sense in the setting when using focal loss #201

Open RomStriker opened 1 year ago

RomStriker commented 1 year ago

Hi,

It seems to me that, starting from Deformable DETR, the cardinality loss no longer makes sense. In the original DETR paper, the class_embed head outputs num_classes + 1 values, so that it also produces a value for the no-object class. With that structure, the cardinality loss, specifically the line that excludes the last class, works as commented in the code. However, all the DETR variants from IDEA-Research that use focal loss have a class_embed head that outputs only num_classes values, so there is no output for the no-object class, and the cardinality loss then treats the last real class as the no-object class. Can you explain its use?
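To make the mismatch concrete, here is a minimal pure-Python sketch of the argmax rule described above. It assumes the cardinality computation counts every query whose argmax is not the last index; the helper names `count_non_last_argmax` and `count_above_threshold` and all logit values are hypothetical and are not taken from the DINO code. The threshold-based count is only one possible alternative for a sigmoid head, not something the repository is known to do.

```python
import math


def count_non_last_argmax(logits_per_query):
    """Count queries whose argmax is NOT the last index.

    Assumption: this mirrors a cardinality rule that treats the
    last output slot as the "no object" class.
    """
    last = len(logits_per_query[0]) - 1
    count = 0
    for logits in logits_per_query:
        argmax = max(range(len(logits)), key=lambda i: logits[i])
        if argmax != last:
            count += 1
    return count


def count_above_threshold(logits_per_query, threshold=0.5):
    """Hypothetical alternative for a sigmoid head: count queries whose
    best per-class probability exceeds a threshold."""
    count = 0
    for logits in logits_per_query:
        best_prob = 1.0 / (1.0 + math.exp(-max(logits)))
        if best_prob > threshold:
            count += 1
    return count


# Softmax-style head: 3 real classes plus a no-object slot at index 3.
softmax_style = [
    [2.0, 0.1, 0.1, 0.3],  # predicts class 0
    [0.1, 0.2, 0.1, 3.0],  # predicts no-object
    [0.1, 0.2, 2.5, 0.3],  # predicts class 2
]

# Focal-style head: 3 real classes only, no no-object slot.
focal_style = [
    [2.0, 0.1, 0.1],     # confident class 0
    [-3.0, -2.5, -2.0],  # all scores low, i.e. background
    [0.1, 0.2, 2.5],     # confident class 2 (the last real class)
]

softmax_count = count_non_last_argmax(softmax_style)  # 2, as intended
focal_count = count_non_last_argmax(focal_style)      # 1: class 2 is dropped
focal_threshold_count = count_above_threshold(focal_style)  # 2
```

With the focal-style head, the query that confidently predicts the last real class is not counted, which is the behaviour the question is about.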

Also, why did you choose not to use the no-object class with focal loss? Is there an explanation that I am missing?
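For context, one common way to handle background with a per-class sigmoid loss is to give unmatched queries an all-zero target row rather than a dedicated no-object column. The sketch below illustrates that idea under stated assumptions: the helper names are hypothetical, label index num_classes is assumed to mean "no object", and the alpha and gamma defaults are the usual focal loss values, not values confirmed for this repository.

```python
import math


def one_hot_without_no_object(labels, num_classes):
    """Build one-hot targets with num_classes columns.

    Assumption: label == num_classes marks "no object". Its one-hot
    column is dropped, so such queries get an all-zero row.
    """
    rows = []
    for label in labels:
        row = [0.0] * (num_classes + 1)
        row[label] = 1.0
        rows.append(row[:-1])  # drop the no-object column
    return rows


def sigmoid_focal_term(logit, target, alpha=0.25, gamma=2.0):
    """Focal loss for a single (logit, binary target) pair."""
    p = 1.0 / (1.0 + math.exp(-logit))
    ce = -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
    p_t = p * target + (1.0 - p) * (1.0 - target)
    alpha_t = alpha * target + (1.0 - alpha) * (1.0 - target)
    return alpha_t * ce * (1.0 - p_t) ** gamma


# Three queries: class 0, no object (index 3), class 2.
targets = one_hot_without_no_object([0, 3, 2], num_classes=3)

# For a background query every class is a negative, so a low logit
# is cheap and a high logit is penalised.
easy_negative = sigmoid_focal_term(-4.0, 0.0)
hard_negative = sigmoid_focal_term(4.0, 0.0)
```

In this encoding, "no object" is expressed as every class score being low, so the head needs no extra output slot.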

Chrazqee commented 11 months ago

I have the same question. My only understanding is that adding a no-object class would cause a class imbalance problem. I know focal loss is aimed at solving that problem, but I find that the predictions are still close to the no-object class, so I am quite confused. Do you know how to explain this? If so, could you explain it to me?

Thanks!