EricLBuehler / mistral.rs

Blazingly fast LLM inference.
MIT License

Phi-3.5-vision-Instruct multiple images loading #795

Open Aveline67 opened 1 week ago

Aveline67 commented 1 week ago

How can I load multiple images for Phi-3.5-vision-Instruct?

And how do I reference them as Image?

Maybe it is supported, but there is no example showing how.

Aveline67 commented 6 days ago

By modifying the code, I was able to load 2 pictures, and it seems to work.

EricLBuehler commented 6 days ago

@Aveline67 can you please share the code? Phi 3.5 vision instruct can support multiple images, just add one message per image, each with its corresponding image!

Aveline67 commented 5 days ago

I just called .add_phiv_image_message() multiple times, but I had to comment out the candle_core::bail!("Can only process one image per batch"); check in mistralrs-core\src\vision_models\phi3_inputs_processor.rs

I am looking to create a proper PR. Some changes are also needed in phi3.rs to ensure that the dimensions of all pictures are passed.
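
To make the change described above more concrete, here is a minimal, self-contained sketch of the idea: a guard that rejects more than one image, next to a relaxed variant that keeps every image and reports the dimensions of each one. Image, Message, image_sizes_single, image_sizes_multi and two_image_messages are hypothetical stand-ins made up for illustration. This is not the actual code from phi3_inputs_processor.rs or phi3.rs, and the real fix may look quite different.

```rust
// Hypothetical stand-ins for illustration only; these are NOT the real
// mistral.rs types or functions.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct Image {
    width: u32,
    height: u32,
}

#[allow(dead_code)]
#[derive(Debug, Clone)]
struct Message {
    text: String,
    image: Option<Image>,
}

/// Mirrors the guard described in the thread: more than one image is rejected.
#[allow(dead_code)]
fn image_sizes_single(messages: &[Message]) -> Result<Vec<(u32, u32)>, String> {
    let images: Vec<&Image> = messages.iter().filter_map(|m| m.image.as_ref()).collect();
    if images.len() > 1 {
        return Err("Can only process one image per batch".to_string());
    }
    Ok(images.iter().map(|img| (img.height, img.width)).collect())
}

/// Relaxed variant: keeps every image and returns one (height, width) pair
/// per image, in message order, so later stages can see all dimensions.
#[allow(dead_code)]
fn image_sizes_multi(messages: &[Message]) -> Vec<(u32, u32)> {
    messages
        .iter()
        .filter_map(|m| m.image.as_ref())
        .map(|img| (img.height, img.width))
        .collect()
}

/// Hypothetical helper: a conversation with one image attached per message.
#[allow(dead_code)]
fn two_image_messages() -> Vec<Message> {
    vec![
        Message {
            text: "Describe the first picture.".to_string(),
            image: Some(Image { width: 640, height: 480 }),
        },
        Message {
            text: "Now compare it with this one.".to_string(),
            image: Some(Image { width: 1024, height: 768 }),
        },
    ]
}
```

With the two messages from two_image_messages(), image_sizes_single returns the "Can only process one image per batch" error, while image_sizes_multi returns one (height, width) pair per picture.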

EricLBuehler commented 3 days ago

@Aveline67 I see! It looks like this should be a fix as well as what you mentioned. Please feel free to open a PR!