fundamentalvision / Deformable-DETR

Deformable DETR: Deformable Transformers for End-to-End Object Detection.
Apache License 2.0

Number of classes when defining the classifier #98

Closed davidnvq closed 2 years ago

davidnvq commented 2 years ago

Thank you for the great work. I'm a little confused by how you define the number of classes. In the original DETR, the category classifier outputs num_classes + 1 values. For example, there are 91 classes in COCO, so the classifier outputs 92 classes, with the final one being the no-object category. I also see that you wrote this in a comment in the forward method:

https://github.com/fundamentalvision/Deformable-DETR/blob/11169a60c33333af00a4849f1808023eba96a931/models/deformable_detr.py#L121

However, you actually defined the category classifier with num_classes outputs in the __init__ method: https://github.com/fundamentalvision/Deformable-DETR/blob/11169a60c33333af00a4849f1808023eba96a931/models/deformable_detr.py#L54

Can you explain more about this? Am I missing something? Thanks a lot.

mordechail commented 2 years ago

hey,

maybe num_classes is the number of classes + 1 (class 0 for bg)?

See also my issue here: https://github.com/fundamentalvision/Deformable-DETR/issues/102#issue-1018506355. If it's actually the number of classes + 1, that also solves my issue.

davidnvq commented 2 years ago

Hi there, there is no no-object class here. Class 0 is also a category, although it is mapped to None under the 91-class convention in COCO. I asked this question because the comment in the code says classes + 1, which confused me. But I believe the authors reused the DETR code base and forgot to update the comment.
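
To make the difference concrete, here is a minimal sketch of the two conventions discussed in this thread. It uses only the Python standard library and is not code from either repository. It assumes the usual reading: a DETR-style head applies a softmax over num_classes + 1 logits and reserves the last index for no-object, while a head with exactly num_classes outputs scores each class independently with a sigmoid, so "no object" simply means that every score is low. The function names and the 0.5 threshold are hypothetical and chosen for illustration only.

```python
import math

NUM_CLASSES = 91  # COCO class count used in this thread


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax_head_predict(logits):
    """DETR-style convention (assumed): num_classes + 1 logits,
    where the last index is the explicit no-object category."""
    assert len(logits) == NUM_CLASSES + 1
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return None if label == NUM_CLASSES else label


def sigmoid_head_predict(logits, threshold=0.5):
    """Convention with exactly num_classes logits (assumed): every class
    gets an independent score, and no-object is implicit when no score
    reaches the (hypothetical) threshold."""
    assert len(logits) == NUM_CLASSES
    scores = [sigmoid(x) for x in logits]
    label = max(range(len(scores)), key=scores.__getitem__)
    return label if scores[label] >= threshold else None
```

With the second convention there is no need for an extra output slot, which would be consistent with the classifier being defined with num_classes outputs and the "+ 1" comment being a leftover.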