salesforce / ALBEF

Code for ALBEF: a new vision-language pre-training method
BSD 3-Clause "New" or "Revised" License

RefCOCO+ Fine-tuning #127

Open leizhu-angus opened 1 year ago

leizhu-angus commented 1 year ago

Does ALBEF support fine-tuning on RefCOCO+ in a fully supervised setting for visual grounding?