XKCUW closed this issue 2 months ago
See https://github.com/open-compass/VLMEvalKit/blob/main/vlmeval/vlm/xcomposer/xcomposer2_4KHD.py#L60. With this function you can freely define the text, the image, and the image resolution. This repo also supports many mainstream VQA benchmarks.
I can use the `vis_processor` method to manually get the embedding of an image (loaded from a URL) on , like the method used in .
(Reference: https://huggingface.co/internlm/internlm-xcomposer2-7b)
However, when I tried the same approach on , it did not work.
Could you give some examples showing how to solve this issue?
@panzhang0212 @yhcao6