ZhangGongjie / Meta-DETR

[T-PAMI 2022] Meta-DETR for Few-Shot Object Detection: Official PyTorch Implementation
MIT License

Difference between Model Architecture in the Paper and the Actual Implementation #38

Open michaelku1 opened 2 years ago

michaelku1 commented 2 years ago

Hello, thanks for your great work. I would like to point out that, in the paper, it looks to me as though the query features are passed to the CAM module. In the actual implementation, however, the query features do not seem to play a role until the final encoder-decoder architecture.

For example, category_code() computes the category features from the support samples, whereas in the meta_detr.py module the query features are extracted from the backbone and interact with the support features only inside self.transformer(), which seems to differ from the architecture in the paper. I am wondering whether something is missing. Thanks.

nhw649 commented 11 months ago

I have the same problem.

NHeLv1 commented 7 months ago

same question

NHeLv1 commented 7 months ago

I found that the class SingleHeadSiameseAttention seems to be the CAM module, but I am not sure.
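
To make the flow discussed in this thread concrete, here is a minimal sketch in plain Python of how support samples could be pooled into per-class codes and how query tokens could then attend to those codes with a single attention head. This is not the repository's code. The helpers `mean_pool` and `single_head_attention`, the toy 2-D features, and the absence of learned projections are all assumptions made for illustration, and whether SingleHeadSiameseAttention or category_code() actually work this way is not confirmed here.

```python
import math

def mean_pool(features):
    """Average a list of feature vectors into one vector (hypothetical class code)."""
    n = len(features)
    dim = len(features[0])
    return [sum(f[d] for f in features) / n for d in range(dim)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def single_head_attention(query_tokens, class_codes):
    """Scaled dot-product attention: each query token attends over the class codes.

    Assumption: no learned query/key/value projections, purely illustrative.
    """
    dim = len(class_codes[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, all_weights = [], []
    for q in query_tokens:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in class_codes]
        weights = softmax(scores)
        # Weighted sum of class codes, one output vector per query token.
        out = [sum(w * k[d] for w, k in zip(weights, class_codes)) for d in range(dim)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy support features for two hypothetical classes.
support = {
    "cat": [[1.0, 0.0], [0.8, 0.2]],
    "dog": [[0.0, 1.0], [0.2, 0.8]],
}
# Step 1: class codes depend on the support samples only.
class_codes = [mean_pool(feats) for feats in support.values()]
# Step 2: query features meet the support information inside the attention step.
query_tokens = [[2.0, 0.0], [0.0, 2.0]]
attended, weights = single_head_attention(query_tokens, class_codes)
```

In this sketch the class codes never see the query image; the query features interact with the support information only in the attention step, which mirrors the ordering described in the original question.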