xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0
4.69k stars 369 forks source link

Logit_bias Error #1924

Closed Rookie-Kai closed 1 month ago

Rookie-Kai commented 1 month ago

System Info / 系統信息

vllm==0.4.2 torch==2.3.0 python==3.10

Running Xinference with Docker? / 是否使用 Docker 运行 Xinfernece?

Version info / 版本信息

0.12.0

The command used to start Xinference / 用以启动 xinference 的命令

xinference-local --host 0.0.0.0 --port 9997

Reproduction / 复现过程

response = client.chat.completions.create( model="Qwen2-72B-Instruction", messages=message, temperature=0, max_tokens=512, logit_bias=logit_bias )

Expected behavior / 期待表现

我在client.chat.completions.create()中使用logit_bias参数时, 返回Error code: 501 - {'detail': 'Not implemented'} 我观察到 #1510 中指出logit_bias尚未实现,请问后续有更新计划么? 还是说当前已支持logit_bias,只是我的调用方法错误?

qinxuye commented 1 month ago

We will do some investigation to see if it's possible to support logit_bias.