TRI-ML / prismatic-vlms

A flexible and efficient codebase for training visually-conditioned language models (VLMs)
MIT License

pretrain.py won't run with arguments #2

Closed rpgrainger-ai closed 4 months ago

rpgrainger-ai commented 4 months ago

I have followed the installation instructions but when I attempt to run the provided example command:

# Run from the root of the repository
torchrun --standalone --nnodes 1 --nproc-per-node 8 scripts/pretrain.py \
  --model.vision_backbone_id "dinosiglip-vit-so-384px" \
  --model.image_resize_strategy "letterbox" \
  --model.llm_backbone_id "vicuna-v15-7b" 

It returns an error:

raise ParsingError(f"Expected a dict with a '{CHOICE_TYPE_KEY}' key for {cls}, got {raw_value}")
draccus.utils.ParsingError: Expected a dict with a 'type' key for <class 'prismatic.conf.models.ModelConfig'>, got {'vision_backbone_id': 'dinosiglip-vit-so-384px', 'image_resize_strategy': 'letterbox', 'llm_backbone_id': 'vicuna-v15-7b'}

The script seems to run when no --model.* overrides are passed.

Further investigation shows that this error appears for any overrides passed to "model" or "dataset", but not for the other components of PretrainConfig.

siddk commented 4 months ago

Hey @rpgrainger-ai - I'm so sorry about this, I left off a line when I was first writing up the README. I just updated the instructions, here's how you'd run the above command instead:

# Run from the root of the repository
torchrun --standalone --nnodes 1 --nproc-per-node 8 scripts/pretrain.py \
  --model.type "one-stage+7b" \
  --model.model_id "<NAME OF NEW MODEL>" \
  --model.vision_backbone_id "dinosiglip-vit-so-384px" \
  --model.image_resize_strategy "letterbox" \
  --model.llm_backbone_id "vicuna-v15-7b" 

Here, model.type identifies the base configuration that you want to build on top of. The full list of model types is available in our config file. If you want to run the single-stage training pipeline we detail in our paper, you always want to start with "one-stage+7b" as the base configuration.
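
To illustrate why the "type" key matters, here is a minimal sketch of a type-keyed config lookup. It is not draccus's or prismatic's actual implementation: the registry, the resolve_model_config helper, and the placeholder field values are all hypothetical, and only the field names and the "one-stage+7b" identifier come from this thread.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelConfig:
    """Stand-in for a model config; the real class has more fields."""
    model_id: str
    vision_backbone_id: str
    image_resize_strategy: str
    llm_backbone_id: str


# Hypothetical registry of base configurations, keyed by type name.
# The field values below are placeholders, not the repository's real defaults.
REGISTRY = {
    "one-stage+7b": ModelConfig(
        model_id="one-stage+7b",
        vision_backbone_id="placeholder-vision-backbone",
        image_resize_strategy="placeholder-resize-strategy",
        llm_backbone_id="placeholder-llm-backbone",
    ),
}


def resolve_model_config(raw: dict) -> ModelConfig:
    """Pick a base config via raw['type'], then apply the remaining keys as overrides."""
    if "type" not in raw:
        # Without a type there is no base config to apply the overrides to.
        raise ValueError(f"Expected a dict with a 'type' key, got {raw}")
    base = REGISTRY[raw["type"]]
    overrides = {key: value for key, value in raw.items() if key != "type"}
    return replace(base, **overrides)


if __name__ == "__main__":
    config = resolve_model_config(
        {
            "type": "one-stage+7b",
            "model_id": "my-new-model",
            "llm_backbone_id": "vicuna-v15-7b",
        }
    )
    print(config)
```

In this sketch, passing only field overrides without "type" fails in the same spirit as the ParsingError above, because the parser cannot tell which base configuration the overrides should modify.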

In addition, specifying --model.model_id separately will set the actual directory path that model logs and checkpoints will be written to (otherwise, it'll default to writing things under runs/one-stage+7b).
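
The fallback described above can be sketched roughly as follows. The resolve_run_dir helper and its parameters are hypothetical and only illustrate the behavior described in this comment; the repository's actual run-directory logic may differ.

```python
from pathlib import Path
from typing import Optional


def resolve_run_dir(model_type: str, model_id: Optional[str] = None, root: str = "runs") -> Path:
    """Return the directory for logs and checkpoints.

    Assumption: when no model_id is given, the base type name is used instead,
    matching the "runs/one-stage+7b" default mentioned in this thread.
    """
    name = model_id if model_id is not None else model_type
    return Path(root) / name


if __name__ == "__main__":
    print(resolve_run_dir("one-stage+7b").as_posix())
    print(resolve_run_dir("one-stage+7b", "my-new-model").as_posix())
```

Setting a distinct model_id per experiment keeps runs that share the same base type from writing into the same directory.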

Thanks for flagging this, and please let me know if you have any other trouble getting things running!

sahilqure commented 1 month ago

@siddk Could you add a detailed description to the README of all the arguments required for fine-tuning and pretraining?