YifanXu74 / MQ-Det

Official PyTorch implementation of "Multi-modal Queried Object Detection in the Wild" (accepted by NeurIPS 2023)
Apache License 2.0
258 stars, 12 forks

Fusion and enhancement of visual query features with text features during model inference #51

Open Real-UtopiaNo opened 5 months ago

Real-UtopiaNo commented 5 months ago

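The paper states that during training, the enhanced visual query features are fused with their corresponding text features via cross-attention. During validation, how is it decided which text feature each visual query feature is fused with for enhancement? Looking forward to your reply! [image]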
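To make the question concrete, here is a minimal sketch of the pairing as I understand it. It is not taken from the MQ-Det code: the names `cross_attend`, `fuse_by_category`, and `gate` are hypothetical, and matching text features to visual queries by category name at validation time is only my assumption about one possible answer. It uses plain Python lists instead of PyTorch tensors so it runs without dependencies.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attend(query, bank):
    # Single-head scaled dot-product attention: one query vector attends
    # over a list of vectors that serve as both keys and values.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, kv)) / math.sqrt(d) for kv in bank]
    weights = softmax(scores)
    return [sum(w * kv[i] for w, kv in zip(weights, bank)) for i in range(d)]

def fuse_by_category(text_feats, visual_queries, gate=1.0):
    # text_feats: {category name: text feature vector}
    # visual_queries: {category name: list of visual query vectors}
    # ASSUMPTION: each text feature is paired with the visual queries stored
    # under the same category name; categories without queries pass through.
    fused = {}
    for name, t in text_feats.items():
        bank = visual_queries.get(name)
        if not bank:
            fused[name] = list(t)
            continue
        attended = cross_attend(t, bank)
        # Residual connection scaled by a (hypothetical) scalar gate.
        fused[name] = [ti + gate * ai for ti, ai in zip(t, attended)]
    return fused

text_feats = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
visual_queries = {"cat": [[0.5, 0.5], [0.5, 0.5]]}
fused = fuse_by_category(text_feats, visual_queries)
```

In this sketch, "cat" has stored visual queries and is enhanced, while "dog" has none and keeps its original text feature. My question is whether validation follows the same per-category pairing or uses a different rule.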