salesforce / LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence
BSD 3-Clause "New" or "Revised" License

Fine-tuning InstructBLIP? #302

Open alpayariyak opened 1 year ago

iamwangyabin commented 1 year ago

It seems that their fine-tuning strategy is similar to the standard training approach for VQA. I noticed that the blip2_vicuna_instruct.py file includes a predict_answers function, which is commonly used in VQA tasks.

To use their approach, you can prepare your datasets as they've described, including image, text_input, and text_output, and then launch train.py. However, I would also like to see more training details to better understand their methodology.
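The dataset format described above can be sketched as a small annotation file check. This is a minimal sketch, not LAVIS's actual loader: the field names `image`, `text_input`, and `text_output` come from the comment above, but the JSON layout, the sample records, and the `validate` helper are illustrative assumptions.

```python
import json

# Hypothetical annotation records in the VQA-style format described above:
# each entry pairs an image path with an instruction (text_input) and the
# expected answer (text_output). The exact file layout is an assumption,
# not LAVIS's canonical schema.
annotations = [
    {
        "image": "images/0001.jpg",
        "text_input": "What color is the car in the picture?",
        "text_output": "red",
    },
    {
        "image": "images/0002.jpg",
        "text_input": "How many people are visible?",
        "text_output": "three",
    },
]

def validate(records):
    """Check that every record carries the three required fields."""
    required = {"image", "text_input", "text_output"}
    return all(required <= set(r) for r in records)

print(validate(annotations))  # prints: True

# Such a list would typically be serialized to JSON and pointed to from a
# dataset config before launching train.py with a --cfg-path argument.
print(json.dumps(annotations[0], indent=2))
```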

chloejiang commented 1 year ago

It seems the run_scripts directory does not include anything related to InstructBLIP. Will the official code for pre-training and fine-tuning InstructBLIP be released?

austinmw commented 1 year ago

Bump, could you expand on the Instruction-Tuning section for the InstructBLIP model page? It's not clear how to do this or even if the full code necessary has been released.

edchengg commented 1 year ago

Could you release VQA finetuning script for instructBLIP?

aopolin-lv commented 1 year ago

> Could you release VQA finetuning script for instructBLIP?

Same problem here.

hangzeli08 commented 1 year ago

same question

qwqwq1445 commented 1 year ago

same question

tigerzjh commented 1 year ago

Same question. How do you fine-tune InstructBLIP?

floriankark commented 1 year ago

I would also like to know how to fine-tune InstructBLIP.

Richar-Du commented 1 year ago

Could the author provide the fine-tuning script of InstructBLIP? @LiJunnan1992

Tower0823 commented 1 year ago

mark!

dydxdt commented 1 year ago

mark

g2zr004 commented 1 year ago

mark

Oklahomawhore commented 1 year ago

mark!

sdc17 commented 1 year ago

same question here

lxmcwt commented 12 months ago

mark

gwyong commented 11 months ago

same question

control-spiderman commented 11 months ago

+1

liu3xing3long commented 11 months ago

+1. Pre-training with Vicuna at stage 2 also needs to be modified.

Lanyu0303 commented 10 months ago

mark

him-mah10 commented 10 months ago

+1

santaboi commented 9 months ago

mark

Clement25 commented 9 months ago

mark

Yuancheng-Xu commented 9 months ago

Guys, are there any training scripts for InstructBLIP on captioning (not VQA) tasks? Something like https://github.com/salesforce/LAVIS/blob/main/run_scripts/blip2/train/train_caption_coco.sh but for InstructBLIP?

owlsan49 commented 9 months ago

mark

idor980 commented 8 months ago

mark

dszpr commented 8 months ago

mark!

findalexli commented 7 months ago

https://github.com/AttentionX/InstructBLIP_PEFT?tab=readme-ov-file It seems the authors have no motivation to release the fine-tuning script, but here is a repo that claims to do the same.

waitzkin commented 7 months ago

We have released the finetuning scripts, so let me know if you have any problem!

Clement25 commented 7 months ago

> We have released the finetuning scripts, so let me know if you have any problem!

Could you please tell me where it is? I found no fine-tuning scripts for InstructBLIP under run_scripts/train/BLIP2.

waitzkin commented 7 months ago

As in Train > Run Script section in README, it's under run_scripts/instructblip/train.

Clement25 commented 7 months ago

> As in Train > Run Script section in README, it's under run_scripts/instructblip/train.

Hi waitzkin, thanks for your reply. Could you please tell me which branch you are on? On the main branch there is no folder named 'instructblip' in the run_scripts folder.

waitzkin commented 7 months ago

The main branch is the correct branch, and the script is here: https://github.com/AttentionX/InstructBLIP_PEFT/tree/main/run_scripts/instructblip/train. The instructblip directory has only one subdirectory, 'train', so it is listed as instructblip/train on the run_scripts page (https://github.com/AttentionX/InstructBLIP_PEFT/tree/main/run_scripts), which might have confused you.

lsnls commented 6 months ago

mark