Closed by joshmyersdean 6 months ago
Hi, thanks for your interest. There is no strict structure requirement. You can directly download the Griffon checkpoint and the CLIP checkpoint from Hugging Face. Then, modify the vision model path in the `config.json` inside the Griffon checkpoint to point to where you placed the CLIP checkpoint. Also, keep 'llava' in the Griffon checkpoint directory name, since this phrase is needed to select the conversation template for LLaVA. Then, run these commands to modify the CLIP checkpoint.
```shell
# resize the clip model to 448 to get the preprocessor
python tools/resize_clip_pos.py --model-path checkpoints/clip-vit-large-patch14 --new-size 448 --patch-size 14 --save-path checkpoints/clip-vit-large-patch14-448

# replace the config and preprocessor_config
cp tools/preprocessor_config.json checkpoints/clip-vit-large-patch14-448/
cp tools/config.json checkpoints/clip-vit-large-patch14-448/
```
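For reference, the `config.json` edit described above could look like the minimal sketch below. The key name `mm_vision_tower` and the helper `point_vision_tower` are assumptions based on common LLaVA-style configs, not something confirmed by this thread; check the actual key in your downloaded checkpoint before editing.

```python
import json
import tempfile
from pathlib import Path

def point_vision_tower(config_path: Path, clip_path: str,
                       key: str = "mm_vision_tower") -> dict:
    """Rewrite the vision-model path in a checkpoint's config.json.

    NOTE: 'mm_vision_tower' is an assumed key name (LLaVA-style configs);
    verify the real key in your Griffon checkpoint's config.json.
    """
    config = json.loads(config_path.read_text())
    config[key] = clip_path
    config_path.write_text(json.dumps(config, indent=2))
    return config

# Demo on a throwaway config file; in practice, point this at
# the config.json inside your downloaded Griffon checkpoint.
with tempfile.TemporaryDirectory() as d:
    cfg = Path(d) / "config.json"
    cfg.write_text(json.dumps({"mm_vision_tower": "openai/clip-vit-large-patch14"}))
    updated = point_vision_tower(cfg, "checkpoints/clip-vit-large-patch14-448")
    print(updated["mm_vision_tower"])
```

Equivalently, you can open `config.json` in a text editor and change the vision-model path by hand; the script just makes the step reproducible.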
Hi @jefferyZhan,

I am wondering what exactly this means:

> also the Griffon checkpoint with 'llava' inserted as this phrase is needed to set the conv template for llava

What should I do to pre-process the Griffon models downloaded from HF?
Thanks for the awesome work!
What is the expected structure of the checkpoints directory? Also, for the CLIP conversion, which file specifically do we convert?
Thank you!