xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

FEAT: OpenAI drop-in replacement for /v1/completions (i.e. chat completions). #564

Closed bmwas closed 2 weeks ago

bmwas commented 10 months ago

Is your feature request related to a problem? Please describe

To fully leverage an open-source language model, I would like to tap into LangChain's ChatOpenAI in a drop-in manner (i.e. use it as-is). LangChain already provides such a drop-in replacement for the vLLM package.

https://python.langchain.com/docs/integrations/chat/vllm

Describe the solution you'd like

Integrate a ChatOpenAI drop-in alternative like the one provided for vLLM (see the link above).
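
For reference, a minimal sketch of what that drop-in usage could look like against Xinference's OpenAI-compatible endpoint, assuming a local server on the default port 9997 with a model already launched; the endpoint URL, API key, and model name below are placeholders, not taken from a real deployment:

```python
# Sketch only: point LangChain's ChatOpenAI at a Xinference server instead of
# api.openai.com. The endpoint, key, and model name are assumed placeholders.
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

chat = ChatOpenAI(
    openai_api_base="http://localhost:9997/v1",  # assumed local Xinference endpoint
    openai_api_key="not-used",                   # dummy value if no auth is configured
    model_name="my-model-uid",                   # UID of the model launched in Xinference
    temperature=0,
)

print(chat([HumanMessage(content="Tell me a joke.")]))
```

If this works, nothing beyond overriding the base URL should be needed on the LangChain side, which appears to be the same pattern the vLLM integration linked above relies on.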

Describe alternatives you've considered

Self-hosting vLLM and exposing an endpoint, but I consider the Xinference approach much superior, especially because of the model registry.

Additional context

https://python.langchain.com/docs/integrations/chat/vllm
https://python.langchain.com/docs/integrations/llms/openai
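
For completeness, the same endpoint can also be exercised without LangChain via the official openai Python client. Again a sketch, with the base URL and model UID being assumptions about a local setup:

```python
# Sketch: call Xinference's OpenAI-compatible chat completions API directly
# using the official openai client (v1-style). URL and model UID are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9997/v1",  # assumed local Xinference endpoint
    api_key="not-used",                   # dummy key if no auth is configured
)

response = client.chat.completions.create(
    model="my-model-uid",  # UID of the model launched in Xinference
    messages=[{"role": "user", "content": "Summarize what Xinference does."}],
)
print(response.choices[0].message.content)
```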

codingl2k1 commented 10 months ago

take

aresnow1 commented 10 months ago

We have created a pull request (https://github.com/langchain-ai/langchain/pull/12702) and are waiting for it to be merged!

github-actions[bot] commented 2 weeks ago

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] commented 2 weeks ago

This issue was closed because it has been inactive for 5 days since being marked as stale.