vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Feature]: microsoft/Phi-3-vision-128k-instruct Vision support #4958

Closed · pseudotensor closed this issue 4 months ago

pseudotensor commented 5 months ago

🚀 The feature, motivation and pitch

https://huggingface.co/microsoft/Phi-3-vision-128k-instruct

Alternatives

No response

Additional context

vLLM is somewhat behind in vision support: idefics2 is already supported by TGI, and LLaVA-NeXT has been out for months but is still not supported. There is a PR for this; is it close to landing?

Isotr0py commented 5 months ago

vLLM's multi-modality support is still being refactored.

So we need to wait for some of the necessary refactoring work (such as ImageProcessor support) to be finished before we can add new vision models.