awslabs / generative-ai-cdk-constructs

AWS Generative AI CDK Constructs are sample implementations of AWS CDK for common generative AI patterns.
https://awslabs.github.io/generative-ai-cdk-constructs/
Apache License 2.0

@cdklabs/generative-ai-cdk-constructs: CDK considers a subsequent deploy of a knowledge base as a creation of a new knowledge base even though it is a simple NPM package upgrade #690

Closed malikalimoekhamedov closed 6 days ago

malikalimoekhamedov commented 1 week ago

Describe the bug

I have a Bedrock knowledge base I construct with CDK like so:

const vectorStore = new opensearchserverless.VectorCollection(
  this,
  'VectorCollectionName',
  {
    collectionName: 'collection-name',
    standbyReplicas:
      process.env.ENV === 'prd'
        ? opensearchserverless.VectorCollectionStandbyReplicas.ENABLED
        : opensearchserverless.VectorCollectionStandbyReplicas.DISABLED,
  }
);

new bedrock.KnowledgeBase(this, 'MyKnowledgeBaseName', {
  embeddingsModel: bedrock.BedrockFoundationModel.TITAN_EMBED_TEXT_V2_1024,
  instruction: 'Some instructions.',
  description: 'Knowledge base description.',
  vectorStore,
});

Without changing anything in this setup, and after simply upgrading to the latest npm package, I now see the following error at deployment:

Resource handler returned message: "KnowledgeBase with name [MyKnowledgeBaseName11111111] already exists. Status Code: 409, Request ID: etc, etc.

Expected Behavior

One should be able to deploy a stack as per usual.

Current Behavior

Error: The stack named STACK_NAME failed to deploy: UPDATE_ROLLBACK_COMPLETE: Resource handler returned message: "KnowledgeBase with name MyKnowledgeBase111 already exists. (Service: BedrockAgent, Status Code: 409, Request ID: etc, etc.

Reproduction Steps

Redeploying the stack after upgrading the library triggers the error. The library upgrade was the only change I made.

Possible Solution

Roll back the library version? I'll try that in the meantime.

Additional Information/Context

No response

CDK CLI Version

2.156.0 (build 2966832)

Framework Version

0.1.264

Node.js Version

v22.8.0

OS

macOS Sonoma 14.6.1

Language

TypeScript

Language Version

5.5.4

Region experiencing the issue

us-west-2

Code modification

No

Other information

No response

malikalimoekhamedov commented 1 week ago

I ran the test and can certify the deployment goes through seamlessly with v0.1.262 of @cdklabs/generative-ai-cdk-constructs.

aws-rafams commented 1 week ago

Hi Malik,

I've analyzed the issue, and my hypothesis is that it's related to the changes in the CloudFormation definition for Knowledge Bases and the subsequent changes in the CDK. Previously, when you created a Knowledge Base with the Titan embedding model, it used an embedding size of 1024 by default. However, now when you use Titan embeddings, you need to specify the desired embedding dimensions (1024, 512, or 256).

Since the construct now defines this property, and CloudFormation marks it as 'Requires Replacement', CloudFormation attempts to replace the knowledge base rather than update it in place, leading to the observed behaviour. This issue could potentially be resolved by using the addDeletionOverride escape hatch.

I will continue to investigate and confirm my hypothesis through further testing. Once I have more conclusive findings, I'll provide an update.

aws-rafams commented 1 week ago

Hi Malik, I confirm my hypothesis. After updating the library and running cdk diff, you get:

[Screenshot: cdk diff output after the upgrade, 2024-09-12]

aws-rafams commented 6 days ago

You can use the following escape hatch to solve the issue:

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as opensearchserverless from '@cdklabs/generative-ai-cdk-constructs/lib/cdk-lib/opensearchserverless';
import { BedrockFoundationModel, KnowledgeBase } from '@cdklabs/generative-ai-cdk-constructs/lib/cdk-lib/bedrock';
import { CfnKnowledgeBase } from 'aws-cdk-lib/aws-bedrock';

export class TestIssue690Stack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const vectorStore = new opensearchserverless.VectorCollection(
      this,
      'VectorCollectionName',
      {
        collectionName: 'collection-name',
        standbyReplicas:
          process.env.ENV === 'prd'
            ? opensearchserverless.VectorCollectionStandbyReplicas.ENABLED
            : opensearchserverless.VectorCollectionStandbyReplicas.DISABLED,
      }
    );

    const kb = new KnowledgeBase(this, 'MyKnowledgeBaseName', {
      embeddingsModel: BedrockFoundationModel.TITAN_EMBED_TEXT_V2_1024,
      instruction: 'Some instructions.',
      description: 'Knowledge base description.',
      vectorStore,
    });

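    // Backward-compatibility escape hatch: remove EmbeddingModelConfiguration from the
    // synthesized template so that CloudFormation does not replace the existing knowledge base.
    const cfnKbs = kb.node.findAll().filter((s) => s instanceof CfnKnowledgeBase) as CfnKnowledgeBase[];
    cfnKbs.forEach((cfnKb) => {
      cfnKb.addDeletionOverride('Properties.KnowledgeBaseConfiguration.VectorKnowledgeBaseConfiguration.EmbeddingModelConfiguration');
    });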
  }
}

When adding the hatch, the cdk diff then becomes:

[Screenshot: cdk diff output with the escape hatch applied, 2024-09-12]
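
For readers who want to see what the deletion override does to the synthesized template, here is a minimal plain-TypeScript sketch. It is not the CDK implementation: `deletePath` is a hypothetical helper that only mimics the effect of addDeletionOverride, and the template fragment (including the `BedrockEmbeddingModelConfiguration.Dimensions` shape and the placeholder ARN) is an assumption made for illustration, not output copied from this stack.

```typescript
// Hypothetical stand-in for the effect of addDeletionOverride: remove the value
// at a dotted path from a nested object and return a patched copy.
type Json = { [key: string]: unknown };

function deletePath(resource: Json, dottedPath: string): Json {
  // Deep copy so the input object is left untouched.
  const copy: Json = JSON.parse(JSON.stringify(resource));
  const keys = dottedPath.split(".");
  const last = keys.pop();
  let node: Json = copy;
  for (const key of keys) {
    const child = node[key];
    if (typeof child !== "object" || child === null) {
      return copy; // The path does not exist, so there is nothing to delete.
    }
    node = child as Json;
  }
  if (last !== undefined) {
    delete node[last];
  }
  return copy;
}

// Assumed, illustrative fragment of a synthesized knowledge base resource.
const synthesized: Json = {
  Type: "AWS::Bedrock::KnowledgeBase",
  Properties: {
    KnowledgeBaseConfiguration: {
      Type: "VECTOR",
      VectorKnowledgeBaseConfiguration: {
        EmbeddingModelArn: "arn:placeholder",
        EmbeddingModelConfiguration: {
          BedrockEmbeddingModelConfiguration: { Dimensions: 1024 },
        },
      },
    },
  },
};

// After the override, the template matches the shape deployed before the upgrade.
const patched = deletePath(
  synthesized,
  "Properties.KnowledgeBaseConfiguration.VectorKnowledgeBaseConfiguration.EmbeddingModelConfiguration",
);
```

The only key removed is EmbeddingModelConfiguration. Sibling properties such as EmbeddingModelArn stay in place, which is why the diff no longer reports a property that requires replacement.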

malikalimoekhamedov commented 6 days ago

@aws-rafams, you are a hero!!!