Hi!
A couple of questions:
(1) What is the best way to use BLIP-2 as a feature extractor for image-text retrieval? I did not see the same interface for BLIP-2 here as for the original BLIP.
(2) Are there any metrics for single-stage retrieval (text-image) with BLIP-2, without fusion-encoder reranking?
Thanks!