As the title says, LAVIS just released InstructBLIP, a new vision-language instruction-tuning framework built on BLIP-2 models. It is reported to achieve state-of-the-art zero-shot generalization performance on a wide range of vision-language tasks. https://github.com/salesforce/LAVIS/tree/main/projects/instructblip