[Bug] When using sglang as the inference framework, if a word starting with "\n" appears in the stop parameter, sglang drops the "\n" during inference #956
[X] 1. I have searched related issues but cannot get the expected help.
[X] 2. The bug has not been fixed in the latest version.
[X] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
Describe the bug
When using sglang as the inference framework, if a word starting with "\n" appears in the stop parameter, sglang drops the "\n" characters during inference.
E.g., with:
prompt = 请换行输出1-10个数字 (i.e. "Output the numbers 1-10, each on a new line")
stop = ['<|endoftext|>', '<|im_end|>', '<|im_start|>']
the output is:
1
2
3
4
5
6
7
8
9
10
But with:
prompt = 请换行输出1-10个数字
stop = ['\n<|endoftext|>', '<|im_end|>', '<|im_start|>']
the output is:
12345678910
"\n" can be followed by any character, and there will be no line break.
Reproduction
OS: Linux x64
GPU: A100
Python: 3.10
sglang: 0.2.7
LLM model: Qwen2-72B-lora-awq-4bit
cmd:
python -m fastchat.serve.controller --host localhost --port 44000
python -m fastchat.serve.vllm_worker --model-path ${MODEL_PATH} --max-model-len 8192 --worker-address "http://0.0.0.0:22006" --port 22006 --model-names "qwen-latest" --controller-address "http://localhost:44000"
python -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 21003 --controller-address "http://localhost:44000"
Then run code:
def test_open_ai(prompt: str, stream: bool = False, model: str = "qwen-latest"):
    ...
Environment