intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.

[RCR] Multimodal serving framework support #11695

Open kevin-t-tang opened 1 month ago

kevin-t-tang commented 1 month ago

Please help implement internlm-xcomposer2-vl-7b serving support in the lightweight serving framework, or in some other serving framework.
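
Until serving support lands, a workable interim baseline is to run the model offline through ipex-llm's Hugging Face `transformers`-style API, since a serving layer would ultimately wrap the same load-and-generate path. The sketch below is a minimal, unofficial example, not the eventual serving integration: the `chat` call signature and the `<ImageHere>` placeholder follow the model's Hugging Face card and may differ across revisions, and `./example.jpg` is a hypothetical local image path.

```python
# Minimal sketch (assumptions noted above): offline single-query inference for
# internlm-xcomposer2-vl-7b with ipex-llm low-bit optimization on an Intel XPU.
import torch
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModel  # drop-in AutoModel with low-bit support

model_path = "internlm/internlm-xcomposer2-vl-7b"
model = AutoModel.from_pretrained(
    model_path,
    load_in_4bit=True,        # ipex-llm INT4 weight compression
    trust_remote_code=True,   # the checkpoint ships custom multimodal code
).eval()
model = model.half().to("xpu")  # place the model on the Intel GPU
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

query = "<ImageHere>Please describe this image in detail."
with torch.inference_mode():
    # chat() signature follows the model card; exact parameters may vary
    response, _ = model.chat(tokenizer, query=query,
                             image="./example.jpg",  # hypothetical image path
                             history=[], do_sample=False)
print(response)
```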

hzjane commented 1 month ago

This feature will be supported in this PR.