clin1223 / VLDet

[ICLR 2023] PyTorch implementation of VLDet (https://arxiv.org/abs/2211.14843)

image and text retrieval #2

Open wendy0527 opened 1 year ago

wendy0527 commented 1 year ago

Does VLDet support image-text retrieval? For example, I want to give a text query and retrieve the best-matching image. If the model supports this, should I use an image-level embedding or the per-instance embeddings? As far as I understand, should I use proj_x = self.linear(input_x) [VLDet/vldet/modeling/roi_heads/zero_shot_classifier.py, line 98] as the image/instance embedding?

clin1223 commented 1 year ago

Thanks for your interest! VLDet does not currently support image-text retrieval. You could try approaching retrieval the way CLIP does.
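
For illustration only, here is a minimal sketch of what CLIP-style retrieval generally looks like: embed the text query and every candidate image into the same vector space, L2-normalize, and rank images by cosine similarity. The vectors below are toy placeholders, and `l2_normalize` and `rank_images` are hypothetical helper names, not part of VLDet or CLIP. In practice the embeddings would come from a model's text and image encoders; whether VLDet's projected region features are suitable for this is not confirmed here.

```python
import math
from typing import List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        return list(vec)
    return [v / norm for v in vec]


def rank_images(
    text_emb: Sequence[float],
    image_embs: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (image_index, cosine_similarity) pairs, best match first."""
    text_unit = l2_normalize(text_emb)
    scores = []
    for idx, image_emb in enumerate(image_embs):
        image_unit = l2_normalize(image_emb)
        # The dot product of unit vectors is the cosine similarity.
        similarity = sum(t * i for t, i in zip(text_unit, image_unit))
        scores.append((idx, similarity))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (assumption: real ones would come from the encoders).
text_embedding = [1.0, 0.0, 0.0]
image_embeddings = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # closest to the query
    [0.5, 0.5, 0.0],  # in between
]

ranking = rank_images(text_embedding, image_embeddings)
best_index = ranking[0][0]
print("Best-matching image index:", best_index)
```

For per-instance embeddings, one option under the same assumptions is to score each image by the maximum similarity over its instance vectors and rank images by that score.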