Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you can run inference with any open-source language, speech recognition, or multimodal model, whether in the cloud, on-premises, or even on your laptop.
Feature request
Will a future release support the DashInfer inference framework?
Motivation
I believe DashInfer can deliver better performance than llama.cpp, although compatibility could be a bottleneck.
Project: https://github.com/modelscope/dash-infer