Open. mkbhanda opened this issue 2 weeks ago
@yao531441, please help check this issue. The maximum-output-tokens setting should not be related to reranking.
@mkbhanda @leslieluyu Can you provide more detailed steps to reproduce? Our tests that use Docker to start ChatQnA without reranking pass normally.
Priority: P2-High
OS type: Ubuntu
Hardware type: Xeon-SPR
Installation method: Deploy method
Running nodes: Single Node
What's the version? Development branch, post V1.0.
Description
The ChatQnA example uses the max_tokens parameter to control the number of LLM output tokens, but the value is not passed along when the re-ranker component is removed from the pipeline. The bug may be in the Mega service or in the parameter name used: the OpenAI API has moved from max_tokens to max_completion_tokens, and we may have migrated to the new name incompletely. We should also check GenAIComps.
This was noticed by @leslieluyu.
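A minimal sketch of the kind of normalization that may be missing. The function name and field handling below are assumptions for illustration only, not the actual Mega or GenAIComps code:

```python
# Hypothetical sketch -- NOT the actual Mega/GenAIComps implementation.
# Normalize the two OpenAI-style spellings of the output-token limit so the
# value survives no matter which pipeline stage builds the LLM request.
def normalize_llm_params(request: dict) -> dict:
    limit = request.get("max_completion_tokens", request.get("max_tokens"))
    normalized = dict(request)
    if limit is not None:
        # Forward both spellings so legacy and current consumers see the limit.
        normalized["max_tokens"] = limit
        normalized["max_completion_tokens"] = limit
    return normalized

print(normalize_llm_params({"messages": "hi", "max_tokens": 17}))
```

If only one spelling is forwarded, a pipeline variant (such as one without the reranker) that reads the other spelling would silently drop the limit, which matches the observed behavior.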
Reproduce steps
Run ChatQnA without the re-ranker, pass a small value for the maximum output tokens (e.g. max_tokens), and observe that the limit is not honored.
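A hedged reproduction sketch; the endpoint, port, and payload fields below are assumptions based on a typical single-node ChatQnA deployment and may need adjusting for your setup:

```python
import json
import urllib.request

# Assumed mega-service endpoint for a single-node ChatQnA deployment.
CHATQNA_URL = "http://localhost:8888/v1/chatqna"


def build_request(question: str, max_tokens: int) -> dict:
    """Build a ChatQnA request with a deliberately small output-token limit."""
    return {"messages": question, "max_tokens": max_tokens}


def send(payload: dict) -> bytes:
    """POST the payload to the mega-service. With the reranker removed,
    the response length reportedly ignores max_tokens."""
    req = urllib.request.Request(
        CHATQNA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


if __name__ == "__main__":
    # Expect a short answer; the bug is that the output is not truncated.
    print(json.dumps(build_request("What is OPEA?", 17)))
```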
Raw log
No response