OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal dialogue model approaching GPT-4V performance
https://internvl.github.io/
MIT License

OpenGVLab/InternViT-6B-448px-V1-5 as Zero Shot Image Classification. #147

Open iavinas opened 2 months ago

iavinas commented 2 months ago

Hi,

Thanks for sharing the model and code with us.

I am trying to use a Vision Foundation Model for a zero-shot classification problem.

This is possible with OpenGVLab/InternVL-14B-224px, but I am not able to do it with OpenGVLab/InternViT-6B-448px-V1-5.

```python
import torch
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained(
    'OpenGVLab/InternViT-6B-448px-V1-5',
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True).cuda().eval()

tokenizer = AutoTokenizer.from_pretrained(
    'OpenGVLab/InternViT-6B-448px-V1-5',
    use_fast=False,
    add_eos_token=True,
    trust_remote_code=True)
```

Is there any way to get the tokenizer for OpenGVLab/InternViT-6B-448px-V1-5?

czczup commented 1 month ago

Hi, OpenGVLab/InternViT-6B-448px-V1-5 is the vision encoder extracted from the pretraining stage of the multimodal large language model (MLLM) OpenGVLab/InternVL-Chat-V1-5. It was trained to serve as a specialized vision encoder for that MLLM, so it ships without a paired text encoder or tokenizer and cannot be used directly for zero-shot classification tasks.
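For context, CLIP-style zero-shot classification (what OpenGVLab/InternVL-14B-224px supports) needs embeddings from *both* an image encoder and a text encoder, which is exactly what a standalone vision tower lacks. A minimal sketch of the scoring step, using dummy NumPy embeddings in place of real encoder outputs (illustrative only, not the InternVL API):

```python
import numpy as np

def zero_shot_scores(image_emb, text_embs, temperature=100.0):
    """Score an image against one text embedding per class prompt.

    image_emb: (d,) embedding from an image encoder.
    text_embs: (n_classes, d) embeddings of prompts like "a photo of a cat".
    Returns a probability per class via temperature-scaled cosine similarity.
    """
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = temperature * (txt @ img)        # cosine similarities, scaled
    e = np.exp(logits - logits.max())         # numerically stable softmax
    return e / e.sum()

# Dummy data standing in for real encoder outputs.
rng = np.random.default_rng(0)
probs = zero_shot_scores(rng.normal(size=512), rng.normal(size=(3, 512)))
print(probs.shape, probs.sum())
```

Without a text tower, there is nothing to embed the class prompts with, which is why the tokenizer request fails for the ViT-only checkpoint.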