boto / boto3

AWS SDK for Python
https://aws.amazon.com/sdk-for-python/
Apache License 2.0

bedrock_agent_runtime APIs are not returning source metadata #4124

Closed adoyon23 closed 1 month ago

adoyon23 commented 1 month ago

Describe the bug

When I call invoke_agent, retrieve, or retrieve_and_generate on the bedrock-agent-runtime client, the response never contains the metadata keys associated with the source files. I know the metadata is set up correctly for the source because it shows up in the vector DB indexes, and when I test the knowledge base directly in the console and view the trace, it shows the correct source metadata.

However, when I call any of the APIs mentioned above referencing the same knowledge base, the response never contains the source metadata information.

Expected Behavior

The retrievedReferences key in the response from an invoke_agent call to the bedrock_agent_runtime should contain a metadata key whose values match the JSON key values set in the corresponding source .metadata.json file.
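The expected shape can be sketched as follows. This is a minimal illustration, not output captured from the service: the helper name `collect_reference_metadata`, the bucket name, and the `department` key are all made up, and the sample chunk is hand-built to show what a reference carrying a metadata key would look like.

```python
def collect_reference_metadata(chunk):
    """Return (s3_uri, metadata) pairs for every retrieved reference in a chunk.

    metadata is None when a reference carries no "metadata" key.
    """
    pairs = []
    citations = chunk.get("attribution", {}).get("citations", [])
    for citation in citations:
        for reference in citation.get("retrievedReferences", []):
            uri = reference.get("location", {}).get("s3Location", {}).get("uri")
            pairs.append((uri, reference.get("metadata")))
    return pairs


# Hypothetical chunk, hand-built to illustrate the expected shape.
sample_chunk = {
    "bytes": b"Example answer",
    "attribution": {
        "citations": [
            {
                "retrievedReferences": [
                    {
                        "content": {"text": "Example passage"},
                        "location": {
                            "type": "S3",
                            "s3Location": {"uri": "s3://example-bucket/doc.pdf"},
                        },
                        # Hypothetical key/value from doc.pdf.metadata.json
                        "metadata": {"department": "finance"},
                    }
                ]
            }
        ]
    },
}

for uri, metadata in collect_reference_metadata(sample_chunk):
    print(uri, metadata)
```

With the behavior reported in this issue, the second element of each pair would be None for every reference.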

Current Behavior

There is no metadata key returned by any of the invoke_agent, retrieve, or retrieve_and_generate APIs.
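The same check can be made against retrieve, which returns the references directly rather than inside a streamed completion. The sketch below is only an illustration: `references_missing_metadata` and `FakeClient` are hypothetical names, and the fake client exists solely so the example runs without AWS credentials. With a real client you would pass `boto3.client("bedrock-agent-runtime")` and a real knowledge base ID instead.

```python
def references_missing_metadata(client, knowledge_base_id, query):
    """Run a retrieve query and return the results that lack a "metadata" key."""
    response = client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
    )
    return [
        result
        for result in response.get("retrievalResults", [])
        if "metadata" not in result
    ]


class FakeClient:
    """Stand-in for the real client; returns a hand-built response."""

    def retrieve(self, knowledgeBaseId, retrievalQuery):
        return {
            "retrievalResults": [
                {"content": {"text": "passage one"}, "score": 0.9},
                {
                    "content": {"text": "passage two"},
                    "metadata": {"department": "finance"},
                },
            ]
        }


missing = references_missing_metadata(FakeClient(), "KB_ID_PLACEHOLDER", "example query")
print(f"{len(missing)} result(s) without metadata")
```

If the problem described here is present, every result returned by the real API would end up in the returned list.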

Reproduction Steps

Call the invoke_agent API with code similar to the following (identifiers redacted):

import boto3


def get_agent_response(response):
    # Returns (chunk_text, reference_text, source_file_list) for the caller below
    chunk_text = ""
    reference_text = ""
    source_file_list = []
    metadata_values = []
    if "completion" not in response:
        message = f"No completion found in response: {response}"
        return message, reference_text, source_file_list
    for event in response["completion"]:
        # Only chunk events carry response text and citations; skip trace events
        if "chunk" not in event:
            continue
        chunk = event["chunk"]

        # Convert the chunk bytes to a string, assuming UTF-8 encoding
        chunk_text = chunk["bytes"].decode("utf-8")
        print("Response from the agent:", chunk_text)

        # If there are citations with more detailed responses, print them
        for citation in chunk.get("attribution", {}).get("citations", []):
            text_part = (
                citation.get("generatedResponsePart", {})
                .get("textResponsePart", {})
                .get("text")
            )
            if text_part is not None:
                print("Detailed response part:", text_part)
            for reference in citation.get("retrievedReferences", []):
                print("Reference")
                print(reference)
                if "text" in reference.get("content", {}):
                    reference_text = reference["content"]["text"]
                    print("Reference text:", reference_text)
                # Look the URI up defensively in case the location is not S3
                source_file = (
                    reference.get("location", {}).get("s3Location", {}).get("uri")
                )
                if source_file:
                    source_file_list.append(source_file)
                if "metadata" in reference:
                    # With the behavior reported here, this branch is never reached
                    print("Found metadata")
                    print(reference["metadata"])
                    metadata_values = reference["metadata"]
    print(f"source_file_list: {source_file_list}")
    print(f"meta_data_values_list: {metadata_values}")
    return chunk_text, reference_text, source_file_list


client = boto3.client("bedrock-agent-runtime", region_name="us-west-2")

response = client.invoke_agent(
    agentAliasId="***",
    agentId="***",
    sessionId="session1",
    inputText="inputText",
    enableTrace=True,
)

chunk_text, reference_text, source_file_list = get_agent_response(response)

Possible Solution

No response

Additional Information/Context

No response

SDK version used

boto3-1.34.99

Environment details (OS name and version, etc.)

Mac OS 14.4.1 (23E224), Python 3.12 runtime for AWS Lambda

tim-finnigan commented 1 month ago

Hi @adoyon23, thanks for reaching out. The corresponding Boto3 commands call the Agents for Amazon Bedrock Runtime APIs, so we need to escalate this to the service team to investigate why the expected metadata is not being returned. Since service APIs like this are used across AWS SDKs, I created a new issue to track this in our cross-SDK repository: https://github.com/aws/aws-sdk/issues/746. Please refer to that issue for updates going forward.

github-actions[bot] commented 1 month ago

This issue is now closed. Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.