xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

Could LoRA adapters also be offered as an option in the web deployment UI? #1105

Closed xs818818 closed 2 weeks ago

xs818818 commented 5 months ago

When calling through the OpenAI-compatible API, I'd like to be able to freely choose between different LoRA adapters or the original base model.

xs818818 commented 5 months ago

What I'm hoping for is to achieve the same effect through the web UI deployment as with this vLLM setup:

```
CUDA_VISIBLE_DEVICES=0 python -m vllm.entrypoints.openai.api_server \
    --trust-remote-code \
    --max-model-len 4096 \
    --model ~/qwen/Qwen1.5-14B-Chat \
    --enable-lora \
    --lora-modules lora1=~/lora/xxx lora2=~/lora/xxx
```

```
curl --request POST \
    --url http://localhost:8000/v1/chat/completions \
    --header 'content-type: application/json' \
    --data '{
        "model": "lora2",
        "messages": [
            { "role": "system", "content": "You are a helpful assistant." },
            { "role": "user", "content": "China is a" }
        ],
        "stop_token_ids": [151645, 151644, 151643],
        "max_tokens": 5,
        "temperature": 0.7
    }'
```

ChengjieLi28 commented 5 months ago

@xs818818 LoRA integration is supported as of v0.9.2; see the documentation: https://inference.readthedocs.io/zh-cn/latest/models/lora.html. Note that Xinference does not currently manage LoRA models for you: you download the adapter yourself and launch it together with the LLM or image model.
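For illustration, a minimal launch sketch along the lines of the linked doc. The LoRA flag name used here (`--peft-model-path`) and the model options are assumptions based on the early LoRA support and have changed across releases, so follow the documentation for your version rather than copying this verbatim:

```
# Sketch only: assumes a locally downloaded LoRA adapter and a Qwen1.5 base model.
# The --peft-model-path flag name is an assumption; newer releases configure LoRA
# differently, so check the linked docs for the exact option.
xinference launch \
    --model-name qwen1.5-chat \
    --model-format pytorch \
    --size-in-billions 14 \
    --peft-model-path ~/lora/xxx
```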

mxdlzg commented 1 month ago

If I fill in a lora config in the UI now, how should I send requests through the OpenAI-compatible API?
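My understanding is that the LoRA configured at launch is applied to the launched model itself, rather than being exposed as a separate model name the way vLLM's --lora-modules does, so requests just target the model UID as usual. A sketch, assuming the default endpoint on port 9997 and a model UID of qwen1.5-chat (both assumptions, substitute your own):

```
# Sketch: endpoint and model UID are assumptions; the LoRA loaded at launch
# is applied server-side, so the request body does not name the adapter.
curl --request POST \
    --url http://localhost:9997/v1/chat/completions \
    --header 'content-type: application/json' \
    --data '{
        "model": "qwen1.5-chat",
        "messages": [
            { "role": "user", "content": "Hello" }
        ],
        "max_tokens": 64
    }'
```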

github-actions[bot] commented 3 weeks ago

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] commented 2 weeks ago

This issue was closed because it has been inactive for 5 days since being marked as stale.