xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

The output stops halfway through an answer, so replies feel incomplete. Is this caused by the context length? #2359

Open Joker-sad opened 4 days ago

Joker-sad commented 4 days ago

System Info / 系統信息

python == 3.9
Name: vllm
Version: 0.5.0.post1

Running Xinference with Docker? / 是否使用 Docker 运行 Xinference?

Version info / 版本信息

Name: xinference Version: 0.15.2

The command used to start Xinference / 用以启动 xinference 的命令

q

Reproduction / 复现过程

q

Expected behavior / 期待表现

I hope there is a solution to this problem, or that the model's context length can be set via a parameter.
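One common cause of replies that cut off mid-sentence is not the model's context window but the per-request generation cap. As a minimal sketch, assuming an Xinference deployment exposing its OpenAI-compatible endpoint (the URL and model UID below are placeholders, not from this issue), raising `max_tokens` in the request payload is usually the first knob to try:

```python
import json

# Hypothetical endpoint and model UID; adjust to your own deployment.
XINFERENCE_URL = "http://localhost:9997/v1/chat/completions"

# max_tokens caps how many tokens the model may generate for the reply.
# If answers stop halfway, a too-small cap is a likelier culprit than
# the context length, which limits the *input* side.
payload = {
    "model": "my-model-uid",  # placeholder UID
    "messages": [{"role": "user", "content": "..."}],
    "max_tokens": 2048,       # generation cap; raise this if replies truncate
}

print(json.dumps(payload, ensure_ascii=False, indent=2))
```

This only shows where the parameter lives in the request; whether the backend (here vLLM) honors a longer output also depends on the engine's own limits.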

SDAIer commented 3 days ago

How are you calling the model? And how did you set maxcontent / maxresponse?
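Before adjusting any limits, it helps to confirm that the reply really was cut off by a token cap. Assuming the response follows the OpenAI chat-completion schema (which Xinference's compatible endpoint returns), the `finish_reason` field distinguishes the two cases:

```python
def was_truncated(response: dict) -> bool:
    """Return True if the first choice stopped because the token cap was hit.

    finish_reason == "length" -> generation hit max_tokens (truncated reply)
    finish_reason == "stop"   -> the model ended the reply on its own
    """
    return response["choices"][0]["finish_reason"] == "length"


# Example response fragment shaped like the OpenAI chat schema
# (contents are illustrative, not taken from this issue).
sample = {
    "choices": [
        {"message": {"content": "partial answer..."}, "finish_reason": "length"}
    ]
}

print(was_truncated(sample))  # → True
```

If `finish_reason` is `"length"`, the fix is on the generation side (`max_tokens`); if it is `"stop"` and the answer still looks incomplete, the prompt or the model itself is more likely at fault.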