salesforce / BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
BSD 3-Clause "New" or "Revised" License

fine tune with retrieval_coco.yaml #26

Closed (BlueCat7 closed this issue 2 years ago)

BlueCat7 commented 2 years ago

Dear Author, thanks for your amazing work. I have a question about fine-tuning with retrieval_coco.yaml. Why does the 'pretrained' field in https://github.com/salesforce/BLIP/blob/main/configs/retrieval_coco.yaml point to https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_retrieval_coco.pth (the checkpoint already finetuned for COCO retrieval) instead of https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_14M.pth or https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base.pth? Thanks.

LiJunnan1992 commented 2 years ago

Hi, thanks for your question. As stated in our instructions, the 'pretrained' field in the yaml file needs to be changed before finetuning.
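For anyone landing on this thread later, a minimal sketch of the change being described (the 'pretrained' key and the checkpoint URLs are taken from the config and the question above; verify the exact value against the repo's current README before use):

```yaml
# configs/retrieval_coco.yaml
# The config ships with the already-finetuned COCO retrieval checkpoint,
# which is what you would load for evaluation:
# pretrained: 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_retrieval_coco.pth'

# To finetune yourself, point 'pretrained' at a pre-trained (not yet finetuned)
# checkpoint instead, e.g. model_base.pth or model_base_14M.pth:
pretrained: 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base.pth'
```

After editing the field, launch finetuning with train_retrieval.py using this config, as described in the repository README.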

BlueCat7 commented 2 years ago

Ok, thanks!