boto / boto3

AWS SDK for Python
https://aws.amazon.com/sdk-for-python/
Apache License 2.0

bedrock_agent_runtime APIs are not returning source metadata #4352

Open vbloise3 opened 4 days ago

vbloise3 commented 4 days ago

Describe the bug

When I call the invoke_agent, retrieve, or retrieve_and_generate APIs for the bedrock_agent_runtime, the response never contains the metadata keys associated with the source files. I know that the metadata is set up correctly for the source, because I see it show up in the vector DB indexes, and if I test the knowledge base directly in the console and view the trace, it shows the correct source metadata.

However, when I call any of the APIs mentioned above referencing the same knowledge base, the response never contains the source metadata information.

Regression Issue

Expected Behavior

The retrievedReferences key in the response from an invoke_agent call to the bedrock_agent_runtime should contain a metadata key whose values match the JSON key values set in the corresponding source .metadata.json file.
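
For readers unfamiliar with the sidecar file mentioned above, here is a minimal sketch of how such a file might be produced. It assumes the convention described in the Bedrock knowledge base documentation: a file named after the source with `.metadata.json` appended, holding a `metadataAttributes` object. The helper name `write_metadata_sidecar` and the attribute `department` are hypothetical and not taken from this issue.

```python
import json
from pathlib import Path


def write_metadata_sidecar(source_file: Path, attributes: dict) -> Path:
    """Write '<source name>.metadata.json' next to source_file and return its path."""
    # Assumed convention: the sidecar keeps the full source file name as its prefix.
    sidecar = source_file.with_name(source_file.name + ".metadata.json")
    # Assumed layout: attributes live under a top-level "metadataAttributes" key.
    sidecar.write_text(json.dumps({"metadataAttributes": attributes}, indent=2))
    return sidecar
```

With a source file `report.pdf`, this would create `report.pdf.metadata.json` in the same directory; both files would then be uploaded to the data source location before syncing the knowledge base.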

Current Behavior

There is no metadata key returned by any of the invoke_agent, retrieve, or retrieve_and_generate APIs.

Reproduction Steps

    try:
        response = bedrock_agent_runtime.retrieve_and_generate(
            input={'text': query},
            retrieveAndGenerateConfiguration={
                'type': 'KNOWLEDGE_BASE',
                'knowledgeBaseConfiguration': {
                    'generationConfiguration': {
                        'promptTemplate': {
                            'textPromptTemplate': 'Use this context to answer the question at the end. Give a very detailed, long answer. Use a friendly, conversational tone.\n\nContext: $search_results$\n\nQuestion: $input$'
                        }
                    },
                    'knowledgeBaseId': KB_ID,
                    'modelArn': f"arn:aws:bedrock:us-east-1:904262394592:inference-profile/us.anthropic.claude-3-5-sonnet-20241022-v2:0",
                    'retrievalConfiguration': {
                        'vectorSearchConfiguration': {
                            'numberOfResults': 5
                        }
                    }
                }
            }
        )
        print(f"\n\n the returned answer: {response} \n\n")
        generated_text = response['output']['text']

        # Extract the retrieved sources
        retrieved_sources = []
        if 'retrievalResults' in response:
            for result in response['retrievalResults']:
                if 'location' in result:
                    location = result['location']
                    if 'uri' in location:
                        retrieved_sources.append({
                            'uri': location['uri'],
                            'score': result.get('score', None),
                            'content': result.get('content', {}).get('text', '')
                        })
            print(f"\n\n the metadata: {retrieved_sources} \n\n")
        return generated_text, retrieved_sources

    except ClientError as e:
        logger.error(f"Error in answer_query_tool: {e}")
        return f"An error occurred: {str(e)}"
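
The snippet above looks for a top-level `retrievalResults` key, which the boto3 documentation lists for the `retrieve` response. For `retrieve_and_generate`, the documented response nests references under `citations[].retrievedReferences[]`, each with `content`, `location`, and `metadata`. The following is a minimal sketch of reading that shape, assuming the documented structure and an S3 data source. The helper name `extract_citation_metadata` and the sample values are hypothetical. It only shows where the metadata would appear if the service returns it, and does not by itself explain the missing metadata reported here.

```python
from typing import Any


def extract_citation_metadata(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Collect uri, text, and metadata from citations[].retrievedReferences[]."""
    sources = []
    for citation in response.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            location = ref.get("location", {})
            sources.append({
                # Assumes an S3 data source; other location types use other keys.
                "uri": location.get("s3Location", {}).get("uri"),
                "content": ref.get("content", {}).get("text", ""),
                # An empty dict here means no metadata key was returned.
                "metadata": ref.get("metadata", {}),
            })
    return sources


# Hypothetical response fragment shaped like the documented output.
sample_response = {
    "output": {"text": "generated answer"},
    "citations": [
        {
            "retrievedReferences": [
                {
                    "content": {"text": "chunk text"},
                    "location": {
                        "type": "S3",
                        "s3Location": {"uri": "s3://example-bucket/report.pdf"},
                    },
                    "metadata": {"department": "finance"},
                }
            ]
        }
    ],
}

sources = extract_citation_metadata(sample_response)
```

Running the helper against the real `response` from the reproduction code and printing the result would show whether each reference carries a `metadata` entry.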

Possible Solution

Return the metadata in the response to the bedrock_agent_runtime.retrieve_and_generate API call.

Additional Information/Context

No response

SDK version used

boto3 1.35.63

Environment details (OS name and version, etc.)

Mac OS 14.7.1, Python 3.11.4

adev-code commented 9 hours ago

Hello @vbloise3, thank you for reaching out. To investigate further, could you please provide the full debug logs by adding the line boto3.set_stream_logger('') to your code, redacting any sensitive information? Please also include more details about the code you are running and the Bedrock setup needed to reproduce this. Thank you.
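
Since the requested debug logs can include ARNs with account IDs, here is a minimal sketch of one way to mask them before sharing. It uses only the standard `logging` module; the class `RedactAccountIds` and the helper `attach_redaction` are hypothetical names. It masks 12-digit account IDs only, so credentials, signatures, and other sensitive values would still need a manual review.

```python
import logging
import re

# Matches 12-digit AWS account IDs such as those embedded in ARNs.
ACCOUNT_ID_PATTERN = re.compile(r"\b\d{12}\b")


class RedactAccountIds(logging.Filter):
    """Mask 12-digit account IDs in each log record before it is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so IDs passed as arguments are masked too.
        record.msg = ACCOUNT_ID_PATTERN.sub("<ACCOUNT_ID>", record.getMessage())
        record.args = ()  # the message is already fully formatted
        return True


def attach_redaction(logger_name: str = "") -> None:
    """Add the filter to every handler currently attached to the named logger."""
    for handler in logging.getLogger(logger_name).handlers:
        handler.addFilter(RedactAccountIds())
```

In a script, this would presumably be called as `attach_redaction('')` right after `boto3.set_stream_logger('')`, on the assumption that the latter attaches a stream handler to the named logger.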