-
When running the API service with openai_api.py, the server starts successfully, but an error is raised when a chat request is sent.
Model: Qwen1.5-14B-Chat
python3 openai_api.py --checkpoint-path D:\projects\Qwen\models\Qwen1.5-14B-Chat --server-port 8052
`INFO: 47.90.xxx.xxx…
-
Starting the server and doing inference used to work, but letting sglang select from choices caused it to hang. I updated from 0.1.11 to 0.1.12 and now the server doesn't start anymore:
```
model=…
```
-
We are building a model that outputs a list of items for each input context. In LangChain, we can use output parsers to enforce list-like formats as follows:
```py
from langchain_core.pydant…
```
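Since the snippet above is cut off, here is a dependency-free sketch of the idea behind a list output parser: emit format instructions into the prompt, then extract and validate a JSON array from the raw model text. The `ListOutputParser` class below is hypothetical (it is not the LangChain class the truncated import refers to), just an illustration of the technique.

```python
import json

class ListOutputParser:
    """Hypothetical minimal list parser: extracts a JSON array
    from raw model output and validates the element type."""

    def get_format_instructions(self) -> str:
        # Appended to the prompt so the model emits a parseable list.
        return 'Return your answer as a JSON array of strings, e.g. ["a", "b"].'

    def parse(self, text: str) -> list[str]:
        # Locate the outermost JSON array in the model output.
        start, end = text.find("["), text.rfind("]")
        if start == -1 or end == -1 or end < start:
            raise ValueError(f"No JSON array found in: {text!r}")
        items = json.loads(text[start : end + 1])
        if not all(isinstance(x, str) for x in items):
            raise ValueError("Expected a list of strings")
        return items

parser = ListOutputParser()
print(parser.parse('Sure! Here you go: ["red", "green", "blue"]'))
```

LangChain's real parsers follow the same two-method shape (`get_format_instructions` plus `parse`), which is why they compose cleanly with prompts and models in a chain.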
-
Hi!
Should zmq be on your dependencies list?
```
2024-02-08 09:56:23 | ERROR | stderr | Traceback (most recent call last):
2024-02-08 09:56:23 | ERROR | stderr | File "/p/haicluster/llama/Fa…
```
-
Hi team! My generation scenario involves rolling back and I was wondering how I could speed this up using sglang.
In the first stage, I have an initial prompt, and I can obtain an output with sent…
-
Using 8xA10s
`!python -m sglang.launch_server --model-path /local_disk0/dillonlaird/hf-llava-v1.6-34b --host 0.0.0.0 --port 1234 --tp 8 --model-mode flashinfer`
Trace
```
The cache for model…
```
-
When I try to use `sglang` locally by following the README.md:
``` sh
python -m sglang.launch_server --model-path NousResearch/Llama-2-7b-chat-hf --port 30000
```
(I use NousResearch/Llama-2-7b-chat-h…
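Once the server is up, it can be queried over plain HTTP. Below is a stdlib-only sketch that builds (but does not send) a request for sglang's `/generate` endpoint; the endpoint path and the `sampling_params` keys follow the sglang server documentation, so adjust them if your version differs.

```python
import json
import urllib.request

def build_generate_request(prompt: str, port: int = 30000) -> urllib.request.Request:
    """Build a POST request for sglang's /generate endpoint.
    Endpoint path and sampling_params keys are assumptions from
    the sglang server docs; adjust for your installed version."""
    payload = {
        "text": prompt,
        "sampling_params": {"temperature": 0.0, "max_new_tokens": 32},
    }
    return urllib.request.Request(
        f"http://localhost:{port}/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_generate_request("The capital of France is")
# To actually send it, the launch_server command above must be running:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["text"])
print(req.full_url)
```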
-
**Describe the bug**
The chat completions endpoint does not parse model outputs correctly for certain prompt formats and continues generating after hitting the end tokens.
**To Reproduce**
Steps to r…
-
### System Info / 系統信息
CentOS 7.9
Python 3.10.6
```
pip3 list
Package                       Version
----------------------------- --------------
absl-py                       2.1.0
accelerate                    …
```
-
I'm using the following code to start a runtime from a Jupyter Notebook:
```python
from sglang import function, gen, set_default_backend, Runtime, RuntimeEndpoint
model_name = "TheBloke/deepsee…
```