aws-samples / amazon-transcribe-live-meeting-assistant


Intermittent: "Default message - you'll see this only if the lambda hook failed" response from meeting assistant #38

Open rstrahan opened 3 months ago

rstrahan commented 3 months ago

Describe the bug

Sometimes the meeting assistant responds: "Default message - you'll see this only if the lambda hook failed"

On investigation, the root cause is this error in the CloudWatch Logs for the function QNA-lma6-BedrockKB-LambdaHook:

Amazon Bedrock KB Exception:  An error occurred (ValidationException) when calling the RetrieveAndGenerate operation: 1 validation error detected: Value at 'retrieveAndGenerateConfiguration.knowledgeBaseConfiguration.generationConfiguration.promptTemplate.textPromptTemplate' failed to satisfy constraint: Member must have length less than or equal to 4000

This is caused by the prompt template exceeding the 4000-character limit of the Bedrock RetrieveAndGenerate API, because too much transcript text is included in the prompt.

By default the transcript is truncated to the last 12 turns (QnABot setting LLM_CHAT_HISTORY_MAX_MESSAGES). Setting this to a lower value resolves the issue, but we should defend against the failure entirely by checking and truncating the prompt length regardless of LLM_CHAT_HISTORY_MAX_MESSAGES, since individual messages can be long (see the sketch below).
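A minimal sketch of that defensive truncation, assuming Python and the boto3 bedrock-agent-runtime client. The build_prompt_template helper, the {transcript} placeholder, and the character-budget wiring are illustrative assumptions, not the Lambda hook's actual code; only the retrieve_and_generate request shape is from the API.

```python
import boto3

bedrock_agent = boto3.client("bedrock-agent-runtime")

# Limit reported in the ValidationException above.
MAX_TEMPLATE_CHARS = 4000

def build_prompt_template(base_template: str, transcript: str) -> str:
    """Fill a hypothetical {transcript} placeholder, trimming the oldest text to fit the budget."""
    overhead = len(base_template.replace("{transcript}", ""))
    budget = max(0, MAX_TEMPLATE_CHARS - overhead)
    if len(transcript) > budget:
        transcript = transcript[-budget:]  # keep the most recent turns
    return base_template.replace("{transcript}", transcript)

def ask_knowledge_base(question, transcript, kb_id, model_arn, base_template):
    template = build_prompt_template(base_template, transcript)
    return bedrock_agent.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
                "generationConfiguration": {
                    # Stays under the 4000-character constraint regardless of
                    # how many chat-history messages were included upstream.
                    "promptTemplate": {"textPromptTemplate": template}
                },
            },
        },
    )
```

With a guard like this, a transcript with very long turns would be trimmed from the oldest text rather than triggering the ValidationException and the fallback "Default message" response.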

To Reproduce

Steps to reproduce the behavior:

1. Generate a transcript with long turns, e.g. where one person is presenting.

2. After several long turns, invoke the meeting assistant via either the 'OK Assistant' voice prompt or the bot button.

3. It responds with "Default message - you'll see this only if the lambda hook failed"

Expected behavior

The meeting assistant should return a valid response rather than the Lambda hook failure message.

Screenshots

image


rstrahan commented 1 month ago

Reduced the default value of LLM_CHAT_HISTORY_MAX_MESSAGES to 10 as an interim measure.

atjohns commented 2 weeks ago

One thing we could try is producing a summary of the transcript so far and sending that, which seems potentially more useful than a fixed number of messages. But that runs into response-time issues as well, since we'd be making more calls to the LLMs. A rough sketch of that approach is below.
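For reference, a rough sketch of that summarization step, assuming the Bedrock Converse API; the prompt wording, length target, and choice of model are illustrative, not a recommendation:

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

def summarize_transcript(transcript: str, model_id: str) -> str:
    """Extra LLM call: condense the transcript so the downstream KB prompt stays small."""
    response = bedrock_runtime.converse(
        modelId=model_id,  # a fast/cheap model would limit the added latency
        messages=[{
            "role": "user",
            "content": [{
                "text": "Summarize this meeting transcript so far in under "
                        "1500 characters, keeping decisions and open questions:\n\n"
                        + transcript
            }],
        }],
    )
    return response["output"]["message"]["content"][0]["text"]
```

The summary would then replace the raw transcript text in the knowledge base prompt, at the cost of one additional model invocation per assistant request.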