InternLM / InternLM-XComposer

InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.

Get the embeddings of the image. #307

Open xinyanghuang7 opened 1 month ago

xinyanghuang7 commented 1 month ago

Thank you very much for contributing such an excellent model!

How can I pass in an image and obtain the embedding that InternLM-XComposer2-VL-7B produces for it?

Could you show me how to do this in a few simple lines of code?

Looking forward to your reply!

Thanks!