jefferyZhan / Griffon

【ECCV2024】The official repo of Griffon series
Apache License 2.0

Expected Structure of Checkpoints? #4

Closed joshmyersdean closed 6 months ago

joshmyersdean commented 6 months ago

Thanks for the awesome work!

What is the expected structure of the checkpoints directory? Also, for the CLIP conversion, which file specifically do we convert?

Thank you!

jefferyZhan commented 6 months ago

Hi, thanks for your interest. There is no strict structure requirement. You can directly download the Griffon checkpoint and the CLIP checkpoint from Hugging Face. Then, modify the vision model path in the 'config.json' in the Griffon checkpoint to where you place the CLIP checkpoint, also the Griffon checkpoint with 'llava' inserted as this phrase is needed to set the conv template for llava. Then, follow these commands to modify the CLIP checkpoint.

# resize the clip model to 448 to get the preprocessor
python tools/resize_clip_pos.py --model-path checkpoints/clip-vit-large-patch14 --new-size 448 --patch-size 14 --save-path checkpoints/clip-vit-large-patch14-448
# replace the config and preprocessor_config
cp tools/preprocessor_config.json checkpoints/clip-vit-large-patch14-448/
cp tools/config.json checkpoints/clip-vit-large-patch14-448/
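
For the config.json step described above, a minimal sketch of updating the vision model path might look like the following. It assumes the path is stored under the key `mm_vision_tower`, as in LLaVA-style configs; that key name and the helper `set_vision_path` are assumptions for illustration, not something confirmed in this thread, so check the actual key in your downloaded config.json first.

```python
import json
from pathlib import Path

# Assumption: the vision model path lives under this key, as in LLaVA-style
# configs. Verify the key name in your own config.json before relying on it.
VISION_KEY = "mm_vision_tower"


def set_vision_path(griffon_dir, clip_dir, key=VISION_KEY):
    """Point the config.json inside griffon_dir at a local CLIP checkpoint.

    Hypothetical helper: rewrites only the given key and leaves every other
    entry in the config untouched.
    """
    config_path = Path(griffon_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    if key not in config:
        # Fail loudly instead of silently adding a key the model never reads.
        raise KeyError(f"{key!r} not found in {config_path}")
    config[key] = str(clip_dir)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

It would be called with your own Griffon checkpoint directory and the directory where you placed the CLIP checkpoint, for example `set_vision_path(griffon_dir, clip_dir)`. Raising on a missing key is a deliberate choice here: if the key name assumption is wrong, you find out immediately rather than ending up with an unused entry.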

XuYunqiu commented 1 month ago

Hi @jefferyZhan, I am wondering what "also the Griffon checkpoint with 'llava' inserted as this phrase is needed to set the conv template for llava" exactly means. What should I do to pre-process the Griffon models downloaded from HF?