Describe the bug
When using Qwen-7B-Chat with the OpenAI completions API, I pass stop tokens such as ["<|im_end|>", "<|endoftext|>"], but generation always runs until the max-length limit is reached. I inspected the output and found that the model never generates an EOS token, so the completion contains a lot of unexpected content.
To reproduce
import openai

client = openai.OpenAI(
    base_url='http://localhost:3000/v1', api_key='na'
)  # Here the server is running on localhost:3000

models = client.models.list()
print('Models:', models.model_dump_json(indent=2))
model = models.data[0].id

completions = client.completions.create(
    prompt='<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n介绍一下南京<|im_end|>\n<|im_start|>assistant\n',
    model=model,
    max_tokens=512,
    stream=False,
    stop=["<|im_end|>", "<|endoftext|>"],
)
print(completions)
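Until the server honors string stop sequences for this model, one client-side workaround is to truncate the returned text at the earliest occurrence of any stop string. A minimal sketch, assuming plain-string matching is sufficient; the truncate_at_stop helper is hypothetical, not part of the OpenAI client:

```python
def truncate_at_stop(text, stop_sequences):
    # Cut the generated text at the earliest occurrence of any stop string.
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

# Example: strip everything from the first chat-template stop token onward.
raw = "南京是江苏省的省会。<|im_end|>\n<|im_start|>user\n..."
print(truncate_at_stop(raw, ["<|im_end|>", "<|endoftext|>"]))
```

This only hides the symptom; the extra tokens are still generated and billed, so the underlying stop-token handling still needs a server-side fix.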
Logs
Environment
transformers: 4.35.2, python: 3.10
System information (Optional)
No response