Open alex8937 opened 2 years ago
Can the author confirm how recall is implemented for both text-to-image and image-to-text retrieval, given that there are 5 captions per image?
Please check this: https://github.com/openai/CLIP/issues/115
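For reference, here is a minimal sketch of one convention often used for Recall@K when each image has several captions. It is not confirmed as what the author or the linked issue does. For image-to-text, a query image counts as a hit if any of its ground-truth captions appears in the top K. For text-to-image, every caption is its own query and counts as a hit if its image appears in the top K. The function name `recall_at_k`, the `sim` matrix layout, and the assumption that caption `j` belongs to image `j // caps_per_image` are all hypothetical choices made for illustration.

```python
import numpy as np


def recall_at_k(sim, caps_per_image=5, ks=(1, 5, 10)):
    """Recall@K for both retrieval directions (illustrative sketch only).

    sim: array of shape (n_images, n_images * caps_per_image) with
         image-caption similarity scores. Assumption: captions are stored
         contiguously, so caption j belongs to image j // caps_per_image.
    Returns two dicts mapping K to recall: (image_to_text, text_to_image).
    """
    sim = np.asarray(sim)
    n_images, n_caps = sim.shape
    assert n_caps == n_images * caps_per_image

    # Image-to-text: rank of the best-ranked ground-truth caption per image.
    i2t_ranks = np.empty(n_images, dtype=int)
    for i in range(n_images):
        order = np.argsort(-sim[i])  # caption indices, most similar first
        gt = np.arange(i * caps_per_image, (i + 1) * caps_per_image)
        i2t_ranks[i] = np.where(np.isin(order, gt))[0].min()

    # Text-to-image: rank of the single ground-truth image per caption.
    t2i_ranks = np.empty(n_caps, dtype=int)
    for j in range(n_caps):
        order = np.argsort(-sim[:, j])  # image indices, most similar first
        t2i_ranks[j] = np.where(order == j // caps_per_image)[0][0]

    # Ranks are 0-based, so a hit within the top K means rank < K.
    i2t = {k: float(np.mean(i2t_ranks < k)) for k in ks}
    t2i = {k: float(np.mean(t2i_ranks < k)) for k in ks}
    return i2t, t2i
```

Under this convention the two directions use different denominators: image-to-text averages over images, while text-to-image averages over all captions, so the two numbers are not directly comparable.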