opea-project / GenAIExamples

Generative AI Examples is a collection of GenAI examples, such as ChatQnA and Copilot, which illustrate the pipeline capabilities of the Open Platform for Enterprise AI (OPEA) project.
https://opea.dev
Apache License 2.0

[Bug] Max output tokens from an LLM should be configurable. #1131

Open mkbhanda opened 2 weeks ago

mkbhanda commented 2 weeks ago

Priority

P2-High

OS type

Ubuntu

Hardware type

Xeon-SPR

Installation method

Deploy method

Running nodes

Single Node

What's the version?

Development branch, post V1.0.

Description

The ChatQnA example appears to use the max_tokens parameter to control the number of output LLM tokens, but the value is not passed along if the re-ranker component is removed from the pipeline. Perhaps we have a bug in the megaservice or in the parameter name used. The OpenAI API uses max_completion_tokens, and we may have migrated to it incompletely. We may also need to check GenAIComps.

This was noticed by @leslieluyu.
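To make the suspected failure mode concrete, here is a hypothetical sketch (the function and parameter names are invented, not actual GenAIExamples or GenAIComps code) of how a pipeline assembler could forward the output cap on the reranker path but drop it otherwise:

```python
# Hypothetical sketch of the suspected bug; not actual GenAIExamples code.
def build_llm_params(user_params: dict, use_reranker: bool) -> dict:
    llm_params = {"temperature": user_params.get("temperature", 0.01)}
    if use_reranker:
        # the reranker branch forwards the output cap to the LLM
        llm_params["max_tokens"] = user_params.get("max_tokens", 1024)
    # bug: with the reranker removed, max_tokens is never copied over,
    # so the LLM falls back to its own default output length
    return llm_params
```

If the cause is something like this, the fix would be to hoist the max_tokens copy out of the branch and, where the backend follows the current OpenAI spec, map it to max_completion_tokens.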

Reproduce steps

Run ChatQnA without the re-ranker and try to control the maximum output tokens by passing in a value, as in the sketch below.
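A minimal request that should exercise this, assuming the standard ChatQnA megaservice endpoint on port 8888 (the host, port, and payload keys may differ per deployment):

```python
# Hedged reproduction sketch: the endpoint and payload follow the ChatQnA
# README conventions but may differ in your deployment.
import requests

url = "http://localhost:8888/v1/chatqna"  # assumed megaservice endpoint
payload = {
    "messages": "What is OPEA?",
    "max_tokens": 16,  # request a very short completion
}

resp = requests.post(url, json=payload, timeout=120)
resp.raise_for_status()
# If the reply runs far past ~16 tokens, the cap was not honored.
print(resp.text)
```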

Raw log

No response

lvliang-intel commented 3 days ago

@yao531441, please help check this issue. The maximum output tokens setting should not be related to reranking.

yao531441 commented 2 days ago

@mkbhanda @leslieluyu Can you provide more detailed steps to reproduce? Our tests using Docker to start ChatQnA without reranking were normal. [screenshot attached]