IDEA-Research / DINO

[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
Apache License 2.0

About the token selection #170

Open loseevaya opened 1 year ago

loseevaya commented 1 year ago

Nice work! When tokens are selected from the encoder output, the output dimension of the class_embedding is 91, which includes the "no object" category. Does selecting tokens this way affect the results?

SlongLiu commented 1 year ago

We use focal loss, so there is no "no object" class among the outputs. You can also view it as multiple independent binary classifications, one per class.
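
To illustrate the reply above, here is a minimal sketch in plain Python, not the repository's actual code. It assumes each class logit is passed through its own sigmoid, that a background token simply gets an all-zero target, and that tokens are ranked by their highest class logit. The function names, the `alpha`/`gamma` values, and the ranking rule are illustrative assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_focal_loss(logits, target_class, alpha=0.25, gamma=2.0):
    """Sum of per-class binary focal losses for a single token.

    target_class is the index of the positive class, or None for a
    background token (all-zero target, so no "no object" class is needed).
    alpha and gamma are assumed, commonly used values.
    """
    total = 0.0
    for c, z in enumerate(logits):
        p = sigmoid(z)
        is_positive = (c == target_class)
        # Probability assigned to the correct binary label for this class.
        p_t = p if is_positive else 1.0 - p
        alpha_t = alpha if is_positive else 1.0 - alpha
        # Clamp to avoid log(0) for extreme logits.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, 1e-12))
    return total


def select_top_k_tokens(token_logits, k):
    """Return indices of the k tokens with the highest per-class logit."""
    scores = [max(row) for row in token_logits]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Three tokens, three classes (toy numbers).
token_logits = [
    [-3.0, -2.5, -4.0],  # low on every class: behaves like background
    [0.5, 2.0, -1.0],
    [-1.0, -0.5, 1.0],
]
selected = select_top_k_tokens(token_logits, k=2)
```

Under these assumptions, a token that scores low on every class acts as background without any dedicated output, and it ranks last when tokens are selected.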