aws-cloudformation / cloudformation-coverage-roadmap

The AWS CloudFormation Public Coverage Roadmap
https://aws.amazon.com/cloudformation/
Creative Commons Attribution Share Alike 4.0 International

New CloudFormation resource - [AWS::OpenSearchServerless::Index] #2043

Open mattyboy84 opened 1 month ago

mattyboy84 commented 1 month ago

Resource Name

OpenSearchServerless - [AWS::OpenSearchServerless::Index]

Details

I'm proposing the addition of a new AWS CloudFormation resource type, AWS::OpenSearchServerless::Index, to facilitate the creation of indexes within OpenSearch Serverless collections.

Background:

OpenSearch Serverless currently supports the creation of collections using the AWS::OpenSearchServerless::Collection resource. However, there is no native support for creating indexes within these collections.

Necessity:

The absence of a dedicated resource for creating indexes poses challenges, especially when creating resources like Bedrock KnowledgeBases (AWS::Bedrock::KnowledgeBase). A Bedrock KnowledgeBase requires a pre-existing index, forcing the use of custom CloudFormation resources, which adds complexity and limits the ability to manage resources efficiently within a single CloudFormation template.

My current implementation of OpenSearch indexes for Bedrock looks roughly like this:

  opensearchServerlessVectorsearchCollection:
    Type: AWS::OpenSearchServerless::Collection
    DependsOn:
      - EncryptionPolicy
    Properties:
      Name: !Sub "${AWS::StackName}-vector"
      Type: VECTORSEARCH
      StandbyReplicas: DISABLED
  KnowledgeBaseWithAoss:
    Type: AWS::Bedrock::KnowledgeBase
    Properties:
      StorageConfiguration:
        Type: "OPENSEARCH_SERVERLESS"
        OpensearchServerlessConfiguration:
          VectorIndexName: !GetAtt opensearchServerlessVectorsearchCollectionIndex.Id
          FieldMapping:
            VectorField: "bedrock-knowledge-base-default-vector"
            TextField: "AMAZON_BEDROCK_TEXT_CHUNK"
            MetadataField: "AMAZON_BEDROCK_METADATA"
  opensearchServerlessVectorsearchCollectionIndex:
    Type: Custom::CollectionIndex # This would be AWS::OpenSearchServerless::Index
    Properties:
      ServiceToken: !GetAtt createOpensearchIndexFunction.Arn
      CollectionId: !Ref opensearchServerlessVectorsearchCollection
      IndexName: bedrock-knowledge-base-default-index
      VectorFields:
        - VectorFieldName: bedrock-knowledge-base-default-vector
          Engine: faiss
          Dimensions: 1536
          DistanceMetric: Euclidean
          M: 16
          ef_construction: 512
      Metadata:
        - MappingField: AMAZON_BEDROCK_METADATA
          DataType: text
          Filterable: False
        - MappingField: AMAZON_BEDROCK_TEXT_CHUNK
          DataType: text
          Filterable: True

The custom resource standing in for AWS::OpenSearchServerless::Index translates these Properties and performs a PUT request against the collection endpoint to create the index:

{
  "settings": {
      "index": {
          "knn": true,
          "knn.algo_param.ef_search": 512
      }
  },
  "mappings": {
      "properties": {
          "AMAZON_BEDROCK_METADATA": {
              "type": "text",
              "index": "false"
          },
          "AMAZON_BEDROCK_TEXT_CHUNK": {
              "type": "text",
              "index": "true"
          },
          "bedrock-knowledge-base-default-vector": {
              "type": "knn_vector",
              "dimension": 1536,
              "method": {
                  "name": "hnsw",
                  "engine": "faiss",
                  "parameters": {
                      "m": 16,
                      "ef_construction": 512
                  },
                  "space_type": "l2"
              }
          }
      }
  }
}
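
For anyone who wants to reproduce the workaround, here is a minimal sketch of how the translation from the Properties to the request body above could look. It is not the actual Lambda code behind `createOpensearchIndexFunction`, which isn't shown in this issue. The function name `build_index_body`, the `SPACE_TYPES` table, and the `ef_search` parameter (defaulting to the 512 used above) are assumptions made for illustration. It also assumes that CloudFormation passes custom resource property values as strings, so numbers and booleans are converted explicitly.

```python
import json

# Assumed mapping from DistanceMetric to the k-NN space_type. The issue only
# shows Euclidean -> l2, so any other metric would have to be added here.
SPACE_TYPES = {"euclidean": "l2"}


def _as_bool(value):
    # Accepts real booleans as well as the strings "true"/"false".
    return str(value).strip().lower() == "true"


def build_index_body(props, ef_search=512):
    """Translate the custom resource Properties into an index-creation body."""
    properties = {}

    for field in props.get("Metadata", []):
        properties[field["MappingField"]] = {
            "type": field["DataType"],
            # Mirrors the body shown in the issue, which passes "index" as a string.
            "index": "true" if _as_bool(field["Filterable"]) else "false",
        }

    for vec in props.get("VectorFields", []):
        metric = str(vec["DistanceMetric"]).lower()
        if metric not in SPACE_TYPES:
            raise ValueError(f"unsupported DistanceMetric: {vec['DistanceMetric']}")
        properties[vec["VectorFieldName"]] = {
            "type": "knn_vector",
            "dimension": int(vec["Dimensions"]),
            "method": {
                "name": "hnsw",
                "engine": vec["Engine"],
                "parameters": {
                    "m": int(vec["M"]),
                    "ef_construction": int(vec["ef_construction"]),
                },
                "space_type": SPACE_TYPES[metric],
            },
        }

    return {
        "settings": {
            "index": {"knn": True, "knn.algo_param.ef_search": int(ef_search)}
        },
        "mappings": {"properties": properties},
    }


if __name__ == "__main__":
    # Sample input modelled on the Properties in the template above,
    # with values given as strings, the way a custom resource would see them.
    example = {
        "IndexName": "bedrock-knowledge-base-default-index",
        "VectorFields": [
            {
                "VectorFieldName": "bedrock-knowledge-base-default-vector",
                "Engine": "faiss",
                "Dimensions": "1536",
                "DistanceMetric": "Euclidean",
                "M": "16",
                "ef_construction": "512",
            }
        ],
        "Metadata": [
            {"MappingField": "AMAZON_BEDROCK_METADATA", "DataType": "text", "Filterable": "false"},
            {"MappingField": "AMAZON_BEDROCK_TEXT_CHUNK", "DataType": "text", "Filterable": "true"},
        ],
    }
    print(json.dumps(build_index_body(example), indent=2))
```

The resulting dictionary would then be serialized and sent as the body of the PUT request to the collection endpoint under the `IndexName` path. That request has to be signed for the OpenSearch Serverless service, which the sketch leaves out.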