salesforce / BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
BSD 3-Clause "New" or "Revised" License

About the zeroshot video-text retrieval #63

Open cdqncn opened 2 years ago

cdqncn commented 2 years ago

Dear author, thanks for your great work! I have a question about the zero-shot video-text retrieval result: was the model fine-tuned on COCO from the 14M BLIP or the 129M BLIP?

LiJunnan1992 commented 2 years ago

It is on the 129M BLIP, thanks.

cdqncn commented 2 years ago

Thanks again, I learned a lot from BLIP.