[Feature Request] Add `Phi-3.5-vision-instruct` VLM model

camel-ai / camel

🐫 CAMEL: Finding the Scaling Law of Agents. The first and the best multi-agent framework. https://www.camel-ai.org

https://docs.camel-ai.org/

Apache License 2.0

5.63k stars 684 forks source link

[Feature Request] Add `Phi-3.5-vision-instruct` VLM model #849

Open lightaime opened 2 months ago

lightaime commented 2 months ago

Required prerequisites

[X] I have searched the Issue Tracker and Discussions that this hasn't already been reported. (+1 or comment there if it has.)
[ ] Consider asking first in a Discussion.

Motivation

Add Phi-3.5-vision-instruct VLM model

Solution

No response

Alternatives

No response

Additional context

No response

tom-doerr commented 2 months ago

Would be interested in helping with this but I'm not quite sure what the issue currently is. Phi-3.5-vision-instruct is supported by vLLM right? So would the goal be to add image support to Camel in general?

CaelumF commented 2 months ago

Working with vLLM or ollama is fine, potentially some work is needed on the chat formatting and multi modal support