TinyLLaVA / TinyLLaVA_Factory

A Framework of Small-scale Large Multimodal Models
https://arxiv.org/abs/2402.14289
Apache License 2.0
661 stars, 69 forks

About the Vision Tower. #111

Open Sootung opened 3 months ago

Sootung commented 3 months ago

Hello, where can I find the pretrained vision towers, such as SigLIP and CLIP? The original ones seem to have been fine-tuned.

YingHuTsing commented 3 months ago

Our vision towers are loaded from Hugging Face. They are the pretrained checkpoints, not fine-tuned ones.
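
For anyone who wants to fetch the upstream weights directly, here is a minimal sketch using the `transformers` library. The Hub model IDs (`openai/clip-vit-large-patch14-336`, `google/siglip-so400m-patch14-384`) and the helper names `resolve_vision_tower` and `load_pretrained_vision_tower` are assumptions for illustration and are not taken from this thread; check the model config you are using for the exact vision tower ID.

```python
# Hub model IDs below are assumptions for illustration, not taken from this thread.
# Each entry maps a short name to (Hub model ID, transformers class name).
VISION_TOWERS = {
    "clip": ("openai/clip-vit-large-patch14-336", "CLIPVisionModel"),
    "siglip": ("google/siglip-so400m-patch14-384", "SiglipVisionModel"),
}


def resolve_vision_tower(name):
    """Return (hub_model_id, transformers_class_name) for a short tower name."""
    key = name.lower()
    if key not in VISION_TOWERS:
        raise ValueError(f"unknown vision tower: {name!r}")
    return VISION_TOWERS[key]


def load_pretrained_vision_tower(name):
    """Download (or read from the local cache) the upstream pretrained encoder."""
    # Imported lazily so the lookup helper works without transformers installed.
    import transformers

    model_id, class_name = resolve_vision_tower(name)
    model_cls = getattr(transformers, class_name)
    # from_pretrained pulls the published checkpoint from the Hugging Face Hub.
    model = model_cls.from_pretrained(model_id)
    return model.eval()
```

Calling `load_pretrained_vision_tower("siglip")` should return the vision encoder only, with weights as published on the Hub. `SiglipVisionModel` needs a reasonably recent `transformers` release.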