dvlab-research / MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Apache License 2.0
3.21k stars 278 forks

Does the current inference code support multi-image input? #114

Open Angelalilyer opened 6 months ago

Angelalilyer commented 6 months ago

Although the code states that multiple input images can be separated by ',', passing them that way raises many errors. After adjusting the input to the appropriate shape, the code runs, but the output is empty.
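
For reference, here is a minimal sketch of what the ','-separated convention amounts to on the input side. It is not taken from the MGM code: `parse_image_paths` is a hypothetical helper, and it only covers splitting the argument into individual paths. It does not address the tensor shapes or the model call where the errors and the empty output occur.

```python
from typing import List


def parse_image_paths(image_arg: str) -> List[str]:
    """Split a ','-separated string of image paths into a clean list.

    Hypothetical helper for illustration only; the repo's own parsing may differ.
    """
    # Split on commas, trim surrounding whitespace, and drop empty entries
    # (for example, from a trailing comma).
    return [part.strip() for part in image_arg.split(",") if part.strip()]


if __name__ == "__main__":
    paths = parse_image_paths("a.jpg, b.png,")
    print(paths)  # ['a.jpg', 'b.png']
```

Checking that this step yields the expected number of paths may help narrow down whether the failure comes from argument handling or from later image preprocessing.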

Has anyone tried to input multiple images?

haochuan-li commented 5 months ago

Same problem. Any update on this?