Open CedricHwong opened 2 weeks ago
The OpenAI API server now applies the chat template and tokenizes it in the same step to avoid double-BOS issues: https://github.com/sgl-project/sglang/blob/23cc66f7b65f885969d4608fd4964e0ba98fb7f5/python/sglang/srt/openai_api/adapter.py#L868
So the server never sees the input text. If you want this feature, can you help us support it? You could add some logging to the OpenAI API server that prints the raw input when a flag is set.
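A minimal sketch of what such logging could look like: the prompt text is logged right after the chat template is applied and before tokenization discards it. The flag name, function, and template below are illustrative assumptions, not sglang's actual API.

```python
import logging

logger = logging.getLogger("sglang.openai_api")

# Hypothetical flag; the real server would set this from a CLI argument
# (the flag name is an assumption, not an existing sglang option).
LOG_REQUEST_TEXT = True

def build_prompt(messages, apply_chat_template):
    """Apply the chat template and, if enabled, log the raw prompt text
    before tokenization (after tokenization only input_ids survive)."""
    prompt_text = apply_chat_template(messages)
    if LOG_REQUEST_TEXT:
        logger.info("raw request text: %r", prompt_text)
    return prompt_text

# Stand-in template for demonstration; the server would call the model
# tokenizer's own chat template instead.
def demo_template(messages):
    return "".join(f"<|{m['role']}|>{m['content']}" for m in messages)

print(build_prompt([{"role": "user", "content": "hi"}], demo_template))
```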
Describe the bug
I checked the Docker logs for the request text, but they only show text=None while input_ids is populated. I want the logs to display the request text directly. What parameters should I add when starting the server?
Docker logs:
in=GenerateReqInput(text=None, input_ids=[151644, 8948, 198, 2610, 525, 264, 10950, 17847, 13, 151645, 198, 151644, 872, 198, 108386, 151645, 198, 151644, 77091, 198], image_data=None, sampling_params={'temperature': 0.0, 'max_new_tokens': None, 'min_new_tokens': 0, 'stop': [], 'stop_token_ids': [], 'top_p': 1.0, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'repetition_penalty': 1.0, 'regex': None, 'n': 1}, rid='640143780ce94c81a56689020f8a5b09', return_logprob=False, logprob_start_len=-1, top_logprobs_num=0, return_text_in_logprobs=True, stream=False, modalities=[], is_single=True, lora_path=None), out={'text': '你好,有什么我可以帮助你的吗?', 'meta_info': {'prompt_tokens': 20, 'completion_tokens': 9, 'completion_tokens_wo_jump_forward': 9, 'finish_reason': {'type': 'stop', 'matched': 151645}, 'id': '640143780ce94c81a56689020f8a5b09'}, 'index': 0} INFO: 172.18.0.1:59398 - "POST /v1/chat/completions HTTP/1.1" 200 OK
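As a stopgap, the input_ids printed in the log can be decoded back to text with the model's tokenizer. A minimal sketch: the stub tokenizer below only illustrates the decode call; in practice you would load the real one from the served model directory, e.g. with Hugging Face transformers' AutoTokenizer.from_pretrained("/32k").

```python
def decode_request(tokenizer, input_ids):
    """Decode the input_ids from a GenerateReqInput log line back to text."""
    return tokenizer.decode(input_ids)

# Stub standing in for the real tokenizer (which would be loaded with
# transformers.AutoTokenizer.from_pretrained on the model directory).
class StubTokenizer:
    vocab = {1: "<|im_start|>", 2: "user", 3: "\n", 4: "hello"}

    def decode(self, ids):
        return "".join(self.vocab[i] for i in ids)

print(decode_request(StubTokenizer(), [1, 2, 3, 4]))
```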
Reproduction
docker run:
docker run -itd --name n72 --runtime nvidia --gpus '"device=0,1,2,6"' \
  -p 1090:30000 \
  -v /mnt/data1/home/fusion_large:/32k \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server --model-path /32k --host 0.0.0.0 --port 30000 --dtype bfloat16 --tensor-parallel-size 4 --served-model-name cosmic-32k --log-requests
Environment
Image: sglang v0.3.0
OS: cat /etc/os-release
NAME="TencentOS Server"
VERSION="2.4"
ID="tencentos"
ID_LIKE="rhel fedora centos tlinux"
VERSION_ID="2.4"
PRETTY_NAME="TencentOS Server 2.4"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:tencentos:tencentos:2"
HOME_URL="https://cloud.tencent.com/product/ts"