npuichigo / openai_trtllm

OpenAI compatible API for TensorRT LLM triton backend
MIT License

Multiple stop_words does not work #50

Closed Raphael-Jin closed 3 months ago

Raphael-Jin commented 4 months ago

When I use this curl to hit the API, it works:

curl -XPOST http://127.0.0.1:3000/v1/completions -H "Content-Type: application/json" -d '{
"model": "ensemble",
"prompt": "How is Meta against Google?",
"stop": [","],
"stream": true
}'

However, when I use:

curl -XPOST http://127.0.0.1:3000/v1/completions -H "Content-Type: application/json" -d '{
"model": "ensemble",
"prompt": "How is Meta against Google?",
"stop": [",", "."],
"stream": true
}'

it fails with this error:

{"timestamp":"2024-07-31T22:11:23.692303Z","level":"ERROR","message":"received error message from triton: [request id: <id_unknown>] expected 1 string elements for inference input 'stop_words', got 2","target":"openai_trtllm::routes::completions"}

Any suggestion about this error?

Raphael-Jin commented 4 months ago

Oh, I figured out the solution: the shape needs to be changed to match the number of stop words here: https://github.com/npuichigo/openai_trtllm/blob/main/src/routes/completions.rs#L222

Raphael-Jin commented 4 months ago

Maybe the shape could be set automatically from the length of the list.
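A rough sketch of that idea, for illustration only. The helper names below (`stop_words_shape`, `element_count`, `shape_matches`) are hypothetical and not taken from openai_trtllm, and the `[1, N]` layout with a leading batch dimension of 1 is an assumption; the dimensions your model config expects may differ. The point is just that the element count implied by the shape has to equal the number of stop strings sent, which is what the error above complains about.

```rust
/// Hypothetical helper (not from openai_trtllm): derive the tensor shape for
/// a `stop_words` input from the number of stop strings in the request.
/// Assumes a leading batch dimension of 1, i.e. a shape of [1, N].
fn stop_words_shape(stop: &[String]) -> Vec<i64> {
    vec![1, stop.len() as i64]
}

/// Number of elements a shape describes (the product of its dimensions).
fn element_count(shape: &[i64]) -> i64 {
    shape.iter().product()
}

/// Mirrors the kind of check behind the reported error: the element count
/// implied by the shape must equal the number of strings actually sent.
fn shape_matches(shape: &[i64], stop: &[String]) -> bool {
    element_count(shape) == stop.len() as i64
}
```

With a fixed shape of `[1, 1]`, `shape_matches` holds for `[","]` but not for `[",", "."]`, whereas a shape built by `stop_words_shape` follows the list length in both cases.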

Raphael-Jin commented 4 months ago

I created a PR for reference. Please review.

https://github.com/npuichigo/openai_trtllm/pull/51