Open iavinas opened 2 months ago
Hi, OpenGVLab/InternViT-6B-448px-V1-5 is a vision encoder extracted from the pretraining stage of the multimodal large language model (MLLM) OpenGVLab/InternVL-Chat-V1-5. It is trained to be a specialized vision encoder for the MLLM and cannot be used directly for zero-shot classification tasks.
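For context, here is a minimal sketch of how CLIP-style zero-shot classification generally works, to show why a vision encoder on its own is not enough: the image embedding has to be compared against text embeddings of the class names in a shared space. This is not the InternVL API. The function names, the toy 3-dimensional embeddings, and the `scale` value are all made up for illustration.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def zero_shot_probs(image_emb, text_embs, scale=100.0):
    """Return one probability per class prompt for a single image.

    image_emb: embedding from an image encoder (hypothetical values here).
    text_embs: one embedding per class prompt from a text encoder that
               shares the same embedding space (also hypothetical).
    scale:     assumed logit scale; real models learn their own value.
    """
    img = l2_normalize(image_emb)
    logits = []
    for text_emb in text_embs:
        txt = l2_normalize(text_emb)
        logits.append(scale * sum(a * b for a, b in zip(img, txt)))
    # Numerically stable softmax over the class logits.
    max_logit = max(logits)
    exps = [math.exp(z - max_logit) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example with invented embeddings standing in for real encoder outputs.
labels = ["cat", "dog"]
image_emb = [0.9, 0.1, 0.0]
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

probs = zero_shot_probs(image_emb, text_embs)
predicted = labels[probs.index(max(probs))]
print(predicted, probs)
```

The `text_embs` input is the part a standalone vision encoder cannot supply, since it needs a text encoder aligned with the image encoder.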
Hi,
Thanks for sharing the model and code with us.
I am trying to use a vision foundation model for a zero-shot classification problem.
This is possible with OpenGVLab/InternVL-14B-224px, but I am not able to do it with OpenGVLab/InternViT-6B-448px-V1-5.
Is there any way to get the tokenizer for OpenGVLab/InternViT-6B-448px-V1-5?