npuichigo / openai_trtllm

OpenAI compatible API for TensorRT LLM triton backend
MIT License

Are all options the same as OpenAI's? #48

Open dongs0104 opened 2 weeks ago

dongs0104 commented 2 weeks ago

When I use the n option, the behavior differs from OpenAI's.

When I set n, it switches to beam search.

npuichigo commented 2 weeks ago

Sorry, I think I misunderstood n in trtllm; I had expected multiple beams to be returned. According to this thread, https://github.com/triton-inference-server/tensorrtllm_backend/issues/499, I may need to make multiple requests to return n samples.
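
For illustration, the "multiple requests" idea could be sketched as below. This is only a rough sketch, not the project's implementation: `sample_n`, `GenerateFn`, and `fake_generate` are hypothetical names, and `fake_generate` stands in for whatever call actually reaches the backend. It assumes each request samples independently, so n concurrent requests yield n separate completions.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

# Hypothetical signature: one prompt in, one sampled completion out.
GenerateFn = Callable[[str], Awaitable[str]]


async def sample_n(generate: GenerateFn, prompt: str, n: int) -> List[Dict]:
    """Emulate OpenAI's `n` by issuing n independent requests concurrently."""
    if n < 1:
        raise ValueError("n must be >= 1")
    # asyncio.gather returns results in argument order,
    # so position i can be used directly as choice index i.
    texts = await asyncio.gather(*(generate(prompt) for _ in range(n)))
    return [
        {"index": i, "message": {"role": "assistant", "content": text}}
        for i, text in enumerate(texts)
    ]


async def fake_generate(prompt: str) -> str:
    # Stand-in for a real backend call (assumption); returns a canned reply.
    await asyncio.sleep(0)
    return f"echo: {prompt}"


choices = asyncio.run(sample_n(fake_generate, "hi", 2))
```

With a real backend, each request would need sampling enabled (and differing random seeds, if the backend takes one) for the n results to differ.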

npuichigo commented 2 weeks ago

By the way, do you know what choice.index would be like when using stream along with n>1?

dongs0104 commented 2 weeks ago

Thanks for your hard work. I will run it on OpenAI, then attach the result :)

dongs0104 commented 2 weeks ago

By the way, do you know what choice.index would be like when using stream along with n>1?

When I call the OpenAI API with n=2 and stream=True, it returns:

data: {"choices":[{"delta":{"role":"assistant"}, "finish_reason":null, "index":0}]}
data: {"choices":[{"delta":{"role":"assistant"}, "finish_reason":null, "index":1}]}
data: {"choices":[{"delta":{"content":"A"}, "finish_reason":null, "index":0}]}
data: {"choices":[{"delta":{"content":"A"}, "finish_reason":null, "index":0}]}
...
data: {"choices":[{"delta":{"content":"B"}, "finish_reason":null, "index":1}]}
data: {"choices":[{"delta":{"content":"B"}, "finish_reason":null, "index":1}]}
...
...
data: [DONE]
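
To make the role of choice.index concrete, here is a minimal client-side sketch that reassembles each sample from a stream like the one above. `collect_by_index` and the `sample` list are hypothetical, written for this illustration. It assumes every `data:` payload is valid JSON (double-quoted keys) and makes no assumption about the order in which chunks for different indices arrive.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable


def collect_by_index(sse_lines: Iterable[str]) -> Dict[int, str]:
    """Concatenate streamed delta content per choice index."""
    parts = defaultdict(list)
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and anything that is not a data field
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            bucket = parts[choice["index"]]  # registers role-only chunks too
            content = choice.get("delta", {}).get("content")
            if content:
                bucket.append(content)
    return {i: "".join(p) for i, p in sorted(parts.items())}


# Hand-written sample mirroring the shape of the output above.
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"},"finish_reason":null,"index":0}]}',
    'data: {"choices":[{"delta":{"role":"assistant"},"finish_reason":null,"index":1}]}',
    'data: {"choices":[{"delta":{"content":"A"},"finish_reason":null,"index":0}]}',
    'data: {"choices":[{"delta":{"content":"A"},"finish_reason":null,"index":0}]}',
    'data: {"choices":[{"delta":{"content":"B"},"finish_reason":null,"index":1}]}',
    'data: {"choices":[{"delta":{"content":"B"},"finish_reason":null,"index":1}]}',
    "data: [DONE]",
]
texts = collect_by_index(sample)
```

Each chunk carries the index of the sample it belongs to, so a client can rebuild all n completions by keying on that field alone.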