TRI-ML / prismatic-vlms

A flexible and efficient codebase for training visually-conditioned language models (VLMs)
MIT License

How to finetune starting from a Prismatic VLM checkpoint #20

Closed djghosh13 closed 4 months ago

djghosh13 commented 5 months ago

I'm looking to finetune your trained VLM models on a custom dataset. Aside from data format and hyperparameters, my main question is how I would go about loading one of the trained models (e.g., prism-dinosiglip+7b) into the training script. Are there existing arguments to scripts/pretrain.py, or do I need to make changes to the file itself? Thanks!

siddk commented 4 months ago

If you pass the path to an existing checkpoint as the pretrained_checkpoint flag to scripts/pretrain.py, it'll start the finetuning procedure from that checkpoint.
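For reference, an invocation might look like the sketch below. The only flag taken from the comment above is pretrained_checkpoint; the checkpoint path is a placeholder, and the exact flag syntax may differ depending on how the script parses its config.

```shell
# Hypothetical invocation, assuming a draccus/argparse-style CLI.
# The checkpoint path is a placeholder -- point it at your own run's checkpoint.
python scripts/pretrain.py \
  --pretrained_checkpoint "runs/<your-run-id>/checkpoints/latest-checkpoint.pt"
```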

If you want to implement any custom logic, I recommend changing the stage flag as well, and implementing the corresponding handling in PrismaticVLM and datasets.py!
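As a starting point for the custom-dataset side, here is a minimal, framework-agnostic sketch of a map-style dataset over (image path, conversation) records. The class name, JSON schema, and field names are all assumptions for illustration, not the repo's actual datasets.py API; you would adapt the record format and preprocessing to whatever the existing dataset classes expect.

```python
# Hypothetical dataset sketch -- names and JSON schema are placeholders,
# not the actual prismatic-vlms datasets.py interface.
import json
from pathlib import Path


class FinetuneDataset:
    """Minimal map-style dataset over image/conversation records.

    Expects an annotations JSON file shaped like:
      [{"image": "images/0001.jpg", "conversations": [...]}, ...]
    """

    def __init__(self, annotations_json: str, image_root: str):
        self.image_root = Path(image_root)
        with open(annotations_json) as f:
            self.records = json.load(f)

    def __len__(self) -> int:
        return len(self.records)

    def __getitem__(self, idx: int) -> dict:
        # Resolve the image path relative to the dataset root and pass the
        # raw conversation turns through; tokenization/image transforms
        # would happen downstream in the real pipeline.
        rec = self.records[idx]
        return {
            "image_path": self.image_root / rec["image"],
            "conversations": rec["conversations"],
        }
```

A class like this can then be wrapped by a standard PyTorch DataLoader once you add the image transforms and tokenization your finetuning stage needs.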