Closed: Tsardoz closed this 2 weeks ago
I have the same issue. I tried to feed the model multiple images, and the answer I got was "image encoder error". I looked at the code in chat.py and found that the chat method of the MiniCPMV class only accepts a single image. I am also curious whether the model can read multiple images at the same time in a conversation, like GPT4.
Hi, this is a good thing to try. The model is capable of taking multiple images as input. However, it wasn't trained on video scenarios, so it may not perform very well on them. You can give it a try. Please refer to this link: https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5/discussions/2
Does it work with multiple frames? I tried reading sequential frames from a folder, converting them to base64, and appending them, but I get an error when calling chat_model.chat(inputs). Is this supported? (Attached: test_video.txt)
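For anyone trying the same thing, here is a minimal sketch of the frame-loading step described above: read the image files in a folder in frame order and base64-encode each one. It uses only the Python standard library. The helper names `natural_key` and `encode_frames` and the list of extensions are my own assumptions, not part of the MiniCPM-V code, and the sketch deliberately stops before the model call, since this thread does not confirm what input format chat_model.chat expects for multiple frames.

```python
import base64
import re
from pathlib import Path


def natural_key(path):
    # Hypothetical helper: split digit runs from text so that
    # "frame_10.jpg" sorts after "frame_2.jpg" instead of before it.
    return [int(tok) if tok.isdecimal() else tok.lower()
            for tok in re.split(r"(\d+)", path.name)]


def encode_frames(folder, extensions=(".jpg", ".jpeg", ".png")):
    """Return (filename, base64 string) pairs for the image files in
    `folder`, ordered by frame number. Non-image files are skipped."""
    frames = sorted(
        (p for p in Path(folder).iterdir()
         if p.is_file() and p.suffix.lower() in extensions),
        key=natural_key,
    )
    return [(p.name, base64.b64encode(p.read_bytes()).decode("ascii"))
            for p in frames]
```

Sorting by a natural key matters here because a plain string sort would put frame_10 before frame_2 and scramble the sequence. It may also help to print the returned file names before building the model input, to rule out an ordering or file-type problem as the cause of the error.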