volcanolee4 closed this issue 2 years ago
OK! Thanks for your reply!
I'm confused about the necessity of the indicator. It seems that the indicators (0 for learnable queries, 1 for known label embeddings) are not needed to distinguish the matching part from the denoising part, since the attention masks already prevent information leakage. Am I misunderstanding something?
In the attention mask, the denoising part can still see the matching part, so the indicator helps to distinguish the two.
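To make the visibility rule above concrete, here is a minimal NumPy sketch of such a mask. It is an illustration only, not the repository's code: the function name `build_dn_attention_mask`, the query layout (denoising groups first, matching queries last), the rule that denoising groups are hidden from each other, and the convention that `True` means "blocked" are all assumptions made for this example.

```python
import numpy as np


def build_dn_attention_mask(num_groups: int, group_size: int, num_matching: int) -> np.ndarray:
    """Return a boolean mask where True means "query i may NOT attend to query j".

    Assumed layout (hypothetical): [group 0 | group 1 | ... | matching queries].
    """
    num_dn = num_groups * group_size
    total = num_dn + num_matching
    mask = np.zeros((total, total), dtype=bool)

    # Matching queries must not see denoising queries, which carry
    # (noised) ground-truth information.
    mask[num_dn:, :num_dn] = True

    # Each denoising group must not see the other denoising groups.
    for g in range(num_groups):
        start, end = g * group_size, (g + 1) * group_size
        mask[start:end, :start] = True
        mask[start:end, end:num_dn] = True

    # Nothing blocks the denoising rows from the matching columns,
    # so the denoising part can still see the matching part.
    return mask


mask = build_dn_attention_mask(num_groups=2, group_size=3, num_matching=4)
```

In this sketch the block `mask[:num_dn, num_dn:]` stays all `False`, which is exactly the one-way visibility described above: the mask alone does not tell a denoising query which of the queries it attends to are matching queries.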
Yes, the indicator is not that important. In our experiments, we found that adding an indicator leads to a small additional performance improvement, so we adopted this design. Thank you.
got it, thx.
Thanks for your excellent work! I have two questions about the label embedding:
1) Does every query have its own one-hot vector with 81 dimensions (80 classes in the COCO dataset plus 1 for the unknown class)? Then can we embed the one-hot vector with an MLP to get the label embedding?
2) The indicator that differentiates a denoising-part query from a matching-part query is either 1 or 0. How is the indicator appended to the label embedding? Is the scalar simply concatenated to the end of the label embedding?
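For concreteness, the scheme described in these two questions could be sketched as below. This is only an illustration of what is being asked, not a confirmed description of the actual implementation. The hidden size of 256, the random table `label_table`, the helper `embed_with_indicator`, and the choice to give matching queries the "unknown" class row are all assumptions for the example. The one certain point is that multiplying a one-hot vector by a weight matrix is the same as looking up one row of that matrix, so a single linear layer on the one-hot vector and an embedding table are equivalent.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_CLASSES = 80   # COCO classes; index 80 is the extra "unknown" class
HIDDEN_DIM = 256   # assumed decoder width, for illustration only

# Hypothetical embedding table: one row per class (plus unknown).
# It has HIDDEN_DIM - 1 columns so that appending the indicator
# yields a vector of exactly HIDDEN_DIM entries.
label_table = rng.standard_normal((NUM_CLASSES + 1, HIDDEN_DIM - 1))


def embed_with_indicator(class_ids, indicator):
    """Look up label embeddings and append the 0/1 indicator as the last entry."""
    class_ids = np.asarray(class_ids)
    emb = label_table[class_ids]                        # (n, HIDDEN_DIM - 1)
    flag = np.full((emb.shape[0], 1), float(indicator))  # (n, 1)
    return np.concatenate([emb, flag], axis=1)          # (n, HIDDEN_DIM)


# Denoising-part queries: known labels, indicator 1.
dn_queries = embed_with_indicator([3, 17, 42], indicator=1)

# Matching-part queries: assumed here to use the "unknown" row, indicator 0.
matching_queries = embed_with_indicator([NUM_CLASSES] * 5, indicator=0)

# A one-hot vector times the table equals a plain row lookup.
one_hot = np.zeros(NUM_CLASSES + 1)
one_hot[17] = 1.0
same_as_lookup = np.allclose(one_hot @ label_table, label_table[17])
```

Under these assumptions, the last entry of every query vector is the indicator, and the rest is the label embedding.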