salesforce / LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Question about the Qformer training #250

Open · TobiasLee opened this issue 1 year ago

TobiasLee commented 1 year ago

Hi, big thanks for your great work and the open-sourced code & weights. I am fine-tuning / continuing training from the released checkpoints and hope you can kindly share some details about the design and training:

  1. What is the average loss of the final checkpoint for the OPT decoder, i.e., the captioning loss? I'd like to use it to check whether my model is converging.

  2. Do you plan to release checkpoints from the first-stage pre-training, since the Q-Former trained in the first stage is model-agnostic?

zzhanghub commented 1 year ago

A checkpoint from the first-stage pre-training would be very helpful. I am also looking forward to its release.

LiJunnan1992 commented 1 year ago

We have released stage-1 checkpoints with both ViT-g and ViT-L.
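
For reference, a minimal sketch of loading the stage-1 model through the LAVIS model zoo. The registry name blip2_feature_extractor and the model types pretrain (ViT-g) / pretrain_vitL (ViT-L) are assumptions here and should be checked against the current model zoo listing; the image path is a placeholder.

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the stage-1 Q-Former (trained with the ITC/ITM/ITG objectives, no frozen LLM).
# Assumption: "pretrain" maps to the ViT-g checkpoint and "pretrain_vitL" to ViT-L.
model, vis_processors, txt_processors = load_model_and_preprocess(
    name="blip2_feature_extractor",
    model_type="pretrain",
    is_eval=True,
    device=device,
)

raw_image = Image.open("example.jpg").convert("RGB")  # placeholder path
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
text = txt_processors["eval"]("a photo of a cat")

# Multimodal Q-Former features: query embeddings attending to both image and text.
features = model.extract_features({"image": image, "text_input": [text]}, mode="multimodal")
print(features.multimodal_embeds.shape)  # (batch, num_query_tokens, hidden_dim)
```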

You can now run BLIP-2 stage-1 pre-training from scratch with bash run_scripts/blip2/train/pretrain_stage1.sh

Thank you.
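
Following up on the original ask about continuing training: one way to use the stage-1 weights is to warm-start a stage-2 (OPT) model from them before fine-tuning. A rough sketch, assuming the stage-2 model is registered as blip2_opt / pretrain_opt2.7b and that BaseModel.load_checkpoint accepts a local path or URL and loads non-strictly; the checkpoint path below is a placeholder.

```python
import torch
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Build the stage-2 captioning model (Q-Former bridging a frozen ViT and a frozen OPT).
# Assumption: this name/model_type pair exists in the current model zoo.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_opt",
    model_type="pretrain_opt2.7b",
    is_eval=False,
    device=device,
)

# Warm-start the Q-Former and query tokens from the stage-1 checkpoint before
# continuing training. load_checkpoint loads with strict=False, so keys that
# exist in only one of the two stages (e.g. the stage-1 ITC/ITM heads, the OPT
# projection layer) are skipped or left at their initialized values.
ckpt_path = "/path/to/blip2_stage1_pretrained.pth"  # placeholder path
msg = model.load_checkpoint(ckpt_path)
print("missing keys:", msg.missing_keys)
```

If I read the released training configs correctly, the stage-2 pre-training config achieves the same effect by pointing its pretrained field at the stage-1 checkpoint, so editing that field may be simpler than loading weights manually.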