boto / boto3

AWS SDK for Python
https://aws.amazon.com/sdk-for-python/
Apache License 2.0

Amazon Bedrock Create Knowledge Base Storage Configuration Bug #3990

Closed cxmiller23 closed 8 months ago

cxmiller23 commented 8 months ago

Describe the bug

I'm trying to create an Amazon Bedrock knowledge base from a Python script. When I call client.create_knowledge_base(), it fails with an error about the storage configuration I'm passing. I've double- and triple-checked the values and can't find anything wrong. I copied the storage configuration from a working knowledge base that was created manually in the console, and it is identical to what client.get_knowledge_base() returns for that knowledge base.

    client = boto3.client("bedrock-agent")
    response = client.create_knowledge_base(
        name="name",
        description="description",
        roleArn="role-with-the-same-values-of-the-aws-created-execution-role",
        knowledgeBaseConfiguration={
            "type": "VECTOR",
            "vectorKnowledgeBaseConfiguration": {
                "embeddingModelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v1"
            },
        },
        storageConfiguration={
            "type": "OPENSEARCH_SERVERLESS",
            "opensearchServerlessConfiguration": {
                "collectionArn": "my-custom-collection-arn-that-has-the-same-index-as-below",
                "vectorIndexName": "bedrock-knowledge-base-default-index",
                "fieldMapping": {
                    "vectorField": "bedrock-knowledge-base-default-vector",
                    "textField": "AMAZON_BEDROCK_TEXT_CHUNK",
                    "metadataField": "AMAZON_BEDROCK_METADATA",
                },
            },
        },
    )
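The comparison described above (the config being passed versus the one returned for a console-created knowledge base) can be made mechanical. The following is a minimal sketch, not part of the original report; `diff_config` is a hypothetical helper name, and it only handles nested dictionaries and scalar values.

```python
from typing import Any


def diff_config(expected: Any, actual: Any, path: str = "") -> list[str]:
    """Return human-readable differences between two nested config values."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        differences = []
        # Walk the union of keys so missing and extra keys are both reported.
        for key in sorted(set(expected) | set(actual)):
            child = f"{path}.{key}" if path else key
            if key not in actual:
                differences.append(f"{child}: missing from actual")
            elif key not in expected:
                differences.append(f"{child}: unexpected in actual")
            else:
                differences.extend(diff_config(expected[key], actual[key], child))
        return differences
    # Leaf values (or mismatched types) are compared directly.
    if expected != actual:
        return [f"{path}: {expected!r} != {actual!r}"]
    return []
```

Assuming the get_knowledge_base() response nests the stored config under `["knowledgeBase"]["storageConfiguration"]`, passing that value and the local `storageConfiguration` dict to `diff_config` would list any field that differs. An empty list means the two are identical.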

Error message

botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the CreateKnowledgeBase operation: The knowledge base storage configuration provided is invalid... Request failed: [security_exception] all shards failed
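The message has two parts: generic validation text and a bracketed token, `[security_exception]`, which looks like an error type passed through from the storage backend. That reading is an assumption, not something the message states. A small sketch for pulling the token out of the message when logging or branching on it (`backend_error_type` is a hypothetical helper name):

```python
import re
from typing import Optional

# Matches a bracketed lowercase token such as "[security_exception]".
_BRACKETED = re.compile(r"\[([a-z_]+)\]")


def backend_error_type(message: str) -> Optional[str]:
    """Return the first bracketed token in an error message, or None."""
    match = _BRACKETED.search(message)
    return match.group(1) if match else None
```

When catching `botocore.exceptions.ClientError`, the message text is available as `error.response["Error"]["Message"]` and can be passed to this helper.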

Expected Behavior

I'd expect the knowledge base to be created successfully since, as far as I can tell, the storage config being passed is valid. Failing that, I'd expect the error message to point out which part of the storage config is invalid, to help with troubleshooting.

Current Behavior

2024-01-15 16:28:41.138 | INFO     | __main__:main:299 - Creating Knowledge Base...
2024-01-15 16:28:41,138 botocore.hooks [DEBUG] Event before-parameter-build.bedrock-agent.CreateKnowledgeBase: calling handler <function generate_idempotent_uuid at 0x10450b1a0>
2024-01-15 16:28:41,138 botocore.handlers [DEBUG] injecting idempotency token (b177-4917-aefd) into param 'clientToken'.
2024-01-15 16:28:41,138 botocore.regions [DEBUG] Calling endpoint provider with parameters: {'Region': 'us-east-1', 'UseDualStack': False, 'UseFIPS': False}
2024-01-15 16:28:41,139 botocore.regions [DEBUG] Endpoint provider result: https://bedrock-agent.us-east-1.amazonaws.com
2024-01-15 16:28:41,139 botocore.hooks [DEBUG] Event before-call.bedrock-agent.CreateKnowledgeBase: calling handler <function add_recursion_detection_header at 0x104509a80>
2024-01-15 16:28:41,139 botocore.hooks [DEBUG] Event before-call.bedrock-agent.CreateKnowledgeBase: calling handler <function inject_api_version_header_if_needed at 0x104530cc0>
2024-01-15 16:28:41,139 botocore.endpoint [DEBUG] Making request for OperationModel(name=CreateKnowledgeBase) with params: {'url_path': '/knowledgebases/', 'query_string': {}, 'method': 'PUT', 'headers': {'Content-Type': 'application/json', 'User-Agent': 'Boto3/1.34.3 md/Botocore#1.34.3 ua/2.0 os/macos#23.1.0 md/arch#arm64 lang/python#3.11.7 md/pyimpl#CPython cfg/retry-mode#legacy Botocore/1.34.3'}, 'body': b'{"name": "demo-rag", "description": "Demo knowledge base for RAG", "roleArn": "arn:aws:iam::x:role/AmazonBedrockExecutionRoleForKnowledgeBase_Default", "knowledgeBaseConfiguration": {"type": "VECTOR", "vectorKnowledgeBaseConfiguration": {"embeddingModelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v1"}}, "storageConfiguration": {"type": "OPENSEARCH_SERVERLESS", "opensearchServerlessConfiguration": {"collectionArn": "arn:aws:aoss:us-east-1:x:collection/pfc4rrjmdklc5dkg26yg", "vectorIndexName": "bedrock-knowledge-base-default-index", "fieldMapping": {"vectorField": "bedrock-knowledge-base-default-vector", "textField": "AMAZON_BEDROCK_TEXT_CHUNK", "metadataField": "AMAZON_BEDROCK_METADATA"}}}, "clientToken": "b177-4917-aefd"}', 'url': 'https://bedrock-agent.us-east-1.amazonaws.com/knowledgebases/', 'context': {'client_region': 'us-east-1', 'client_config': <botocore.config.Config object at 0x1054ff590>, 'has_streaming_input': False, 'auth_type': None}}
2024-01-15 16:28:41,140 botocore.hooks [DEBUG] Event request-created.bedrock-agent.CreateKnowledgeBase: calling handler <bound method RequestSigner.handler of <botocore.signers.RequestSigner object at 0x1054ff550>>
2024-01-15 16:28:41,140 botocore.hooks [DEBUG] Event choose-signer.bedrock-agent.CreateKnowledgeBase: calling handler <function set_operation_specific_signer at 0x10450b060>
2024-01-15 16:28:41,140 botocore.auth [DEBUG] Calculating signature using v4 auth.
2024-01-15 16:28:41,140 botocore.auth [DEBUG] CanonicalRequest:
PUT
/knowledgebases/

content-type:application/json
host:bedrock-agent.us-east-1.amazonaws.com
x-amz-date:20240115T222841Z

content-type;host;x-amz-date
2024-01-15 16:28:41,140 botocore.auth [DEBUG] StringToSign:
AWS4-HMAC-SHA256
20240115T222841Z
20240115/us-east-1/bedrock/aws4_request
2024-01-15 16:28:41,140 botocore.hooks [DEBUG] Event request-created.bedrock-agent.CreateKnowledgeBase: calling handler <function add_retry_headers at 0x104531440>
2024-01-15 16:28:41,140 botocore.endpoint [DEBUG] Sending http request: <AWSPreparedRequest stream_output=False, method=PUT, url=https://bedrock-agent.us-east-1.amazonaws.com/knowledgebases/, headers={'Content-Type': b'application/json', 'User-Agent': b'Boto3/1.34.3 md/Botocore#1.34.3 ua/2.0 os/macos#23.1.0 md/arch#arm64 lang/python#3.11.7 md/pyimpl#CPython cfg/retry-mode#legacy Botocore/1.34.3', 'X-Amz-Date': b'20240115T222841Z', 'Authorization': b'AWS4-HMAC-SHA256 Credential=x/20240115/us-east-1/bedrock/aws4_request, SignedHeaders=content-type;host;x-amz-date, Signature=x', 'amz-sdk-invocation-id': b'e6c7-493b-8cfd', 'amz-sdk-request': b'attempt=1', 'Content-Length': '805'}>
2024-01-15 16:28:41,140 botocore.httpsession [DEBUG] Certificate path: /opt/homebrew/lib/python3.11/site-packages/certifi/cacert.pem
2024-01-15 16:28:41,595 botocore.parsers [DEBUG] Response headers: {'Date': 'Mon, 15 Jan 2024 22:28:41 GMT', 'Content-Type': 'application/json', 'Content-Length': '132', 'Connection': 'keep-alive', 'x-amzn-RequestId': '5837-429f-9aca', 'x-amzn-ErrorType': 'ValidationException', 'x-amz-apigw-id': 'RmiwAH3VIAMEjgA=', 'X-Amzn-Trace-Id': 'Root=1-65a5b199-696fe7bb515bc1b37ab7f99c'}
2024-01-15 16:28:41,595 botocore.parsers [DEBUG] Response body:
b'{"message":"The knowledge base storage configuration provided is invalid... Request failed: [security_exception] all shards failed"}'
2024-01-15 16:28:41,596 botocore.parsers [DEBUG] Response headers: {'Date': 'Mon, 15 Jan 2024 22:28:41 GMT', 'Content-Type': 'application/json', 'Content-Length': '132', 'Connection': 'keep-alive', 'x-amzn-RequestId': '05255aa0-5837-429f-9aca-466800132436', 'x-amzn-ErrorType': 'ValidationException', 'x-amz-apigw-id': 'RmiwAH3VIAMEjgA=', 'X-Amzn-Trace-Id': 'Root=1-65a5b199-696fe7bb515bc1b37ab7f99c'}
2024-01-15 16:28:41,596 botocore.parsers [DEBUG] Response body:
b'{"message":"The knowledge base storage configuration provided is invalid... Request failed: [security_exception] all shards failed"}'
2024-01-15 16:28:41,596 botocore.hooks [DEBUG] Event needs-retry.bedrock-agent.CreateKnowledgeBase: calling handler <botocore.retryhandler.RetryHandler object at 0x1054da690>
2024-01-15 16:28:41,596 botocore.retryhandler [DEBUG] No retry needed.
Traceback (most recent call last):
  File "/.../create_knowledge_base.py", line 312, in <module>
    main()
  File "/...create_knowledge_base.py", line 303, in main
    kb_id = create_knowledge_base(collection_arn)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/...create_knowledge_base.py", line 225, in create_knowledge_base
    response = bedrock_client.create_knowledge_base(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/botocore/client.py", line 553, in _api_call
    return self._make_api_call(operation_name, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/botocore/client.py", line 1009, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the CreateKnowledgeBase operation: The knowledge base storage configuration provided is invalid... Request failed: [security_exception] all shards failed

Reproduction Steps

Since the Bedrock services are still under active development, I don't have a clean, simple set of steps to reproduce this, so I'll do my best.

  1. Create a working knowledge base manually in the console
    1. Choose to have AWS create a new IAM role
    2. Choose to have AWS create a new OpenSearch Serverless collection
    3. Create an S3 bucket to upload files to
    4. Confirm and create the knowledge base
    5. In the IAM console, duplicate or modify the IAM role that was created so that it allows * OpenSearch collections, then copy the IAM role ARN
    6. In OpenSearch Serverless Collections, copy the collection ARN
  2. Create a local Python script with the code below, update the constants, and run the script
  3. Hopefully, see the same storage config error

import boto3

client = boto3.client("bedrock-agent")

# Update the following with your own values
KB_ROLE_ARN = "arn:aws:iam::<account>:role/<name>"
OS_COLLECTION_ARN = "arn:aws:aoss:<region>:<account>:collection/<id>"

KB_NAME = "bedrock-knowledge-base-test"
STORAGE_PREFIX = "bedrock-knowledge-base-default"

response = client.create_knowledge_base(
    name=KB_NAME,
    description=KB_NAME,
    roleArn=KB_ROLE_ARN,
    knowledgeBaseConfiguration={
        "type": "VECTOR",
        "vectorKnowledgeBaseConfiguration": {
            "embeddingModelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v1"
        },
    },
    storageConfiguration={
        "type": "OPENSEARCH_SERVERLESS",
        "opensearchServerlessConfiguration": {
            "collectionArn": OS_COLLECTION_ARN,
            "vectorIndexName": f"{STORAGE_PREFIX}-index",
            "fieldMapping": {
                "vectorField": f"{STORAGE_PREFIX}-vector",
                "textField": "AMAZON_BEDROCK_TEXT_CHUNK",
                "metadataField": "AMAZON_BEDROCK_METADATA",
            },
        },
    },
)

print(f"Knowledge base response: {response}")
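String values derived from a shared prefix are easy to get subtly wrong, for example by leaving a literal `{STORAGE_PREFIX}` in a value when the `f` prefix is omitted. The sketch below builds the storage configuration in one place and adds a check for leftover braces. Both function names are hypothetical, and the field names are taken from the script above.

```python
def build_storage_configuration(collection_arn: str, prefix: str) -> dict:
    """Build the OpenSearch Serverless storage config used in the script above."""
    return {
        "type": "OPENSEARCH_SERVERLESS",
        "opensearchServerlessConfiguration": {
            "collectionArn": collection_arn,
            "vectorIndexName": f"{prefix}-index",
            "fieldMapping": {
                "vectorField": f"{prefix}-vector",
                "textField": "AMAZON_BEDROCK_TEXT_CHUNK",
                "metadataField": "AMAZON_BEDROCK_METADATA",
            },
        },
    }


def find_unformatted_placeholders(value, path: str = "") -> list:
    """Return dotted paths of string values that still contain '{' or '}'."""
    if isinstance(value, dict):
        found = []
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else key
            found.extend(find_unformatted_placeholders(child, child_path))
        return found
    if isinstance(value, str) and ("{" in value or "}" in value):
        return [path]
    return []
```

Running `find_unformatted_placeholders` on the config before calling create_knowledge_base() gives a cheap local sanity check. It says nothing about whether the index or fields actually exist in the collection.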

Possible Solution

No response

Additional Information/Context

No response

SDK version used

1.34.3

Environment details (OS name and version, etc.)

Mac OS Sonoma 14.1.1

cxmiller23 commented 8 months ago

Update - I re-ran the job today and it worked as expected. I didn't change anything locally. I did clean up resources in the console before re-running; I'm not sure whether that made a difference (I wouldn't expect it to). I'm closing this since the error seems to be gone.
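The thread does not establish why the call started working. If the failure is transient (an assumption, not a confirmed cause), a bounded retry with exponential backoff is one way to ride it out. This is a generic sketch using only the standard library; `call_with_retries` and its parameters are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    is_retryable: Callable[[Exception], bool],
    attempts: int = 5,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying retryable failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as error:
            # Give up on the last attempt or on errors the caller rejects.
            if attempt == attempts or not is_retryable(error):
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("attempts must be at least 1")
```

With boto3, `operation` could be a lambda wrapping the create_knowledge_base() call, and `is_retryable` could check whether the error text contains `security_exception`. Retrying only makes sense if the configuration itself is correct.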
