(bedrock): (Add TITAN_EMBED_TEXT_V2)

yuya-tajima commented 3 months ago

Describe the feature

Titan Text Embeddings v2 is now available in the AWS management console.

Use Case

When instantiating the KnowledgeBase class, set embeddingsModel to TITAN_EMBED_TEXT_V2.

Currently, however, three values (1024, 512, and 256) can be specified as vectorDimensions, so it may be necessary to define a separate constant for each one, such as TITAN_EMBED_TEXT_V2_1024.

Alternatively, assuming that arbitrary vectorDimensions may be allowed in the future, could the value be passed as a parameter? That possibility should also be considered.
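As a rough illustration of the parameter-based option, here is a minimal sketch. The `titanEmbedTextV2` helper and the `EmbeddingModelSpec` type are hypothetical names made up for this example and are not part of the library; only the three dimension sizes come from this issue, and the model ID is the one used in a later comment in this thread.

```typescript
// Hypothetical sketch: neither this type nor this helper exists in the library.
interface EmbeddingModelSpec {
  readonly modelId: string;
  readonly vectorDimensions: number;
}

// The three dimension sizes mentioned in this issue.
const TITAN_V2_DIMENSIONS: readonly number[] = [256, 512, 1024];

// Returns a model spec for the requested dimension size, rejecting
// any value that is not one of the three known sizes.
function titanEmbedTextV2(vectorDimensions: number = 1024): EmbeddingModelSpec {
  if (TITAN_V2_DIMENSIONS.indexOf(vectorDimensions) === -1) {
    throw new Error(
      'Unsupported vectorDimensions ' + vectorDimensions +
        '; expected one of ' + TITAN_V2_DIMENSIONS.join(', '),
    );
  }
  return { modelId: 'amazon.titan-embed-text-v2:0', vectorDimensions };
}
```

Compared with one constant per size, a validated parameter like this would not need a new constant if more sizes were allowed later, only a change to the list of accepted values.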

Proposed Solution

No response

Other Information

No response

Acknowledgements

[x] I may be able to implement this feature request
[ ] This feature might incur a breaking change

krokoko commented 3 months ago

Hi @yuya-tajima, thank you for this feature request! The model is indeed available; however, it seems the CDK L1 constructs do not let you pass a specific vectorDimensions value, only the model_arn. If I'm not mistaken, the model_arn is the same regardless of the vectorDimensions, so it is not possible to configure it through IaC at the moment. Once configuring vectorDimensions becomes possible, we will add it to the library.

yuya-tajima commented 3 months ago

Hi @krokoko, thank you for your reply. I understand that only EmbeddingModelArn can be specified in a KnowledgeBase VectorKnowledgeBaseConfiguration.

Currently, the embeddingsModel property can be set to a constant as shown below, so I would be happy if TITAN_EMBED_TEXT_V2 could be specified in the same way.

const kb = new bedrock.KnowledgeBase(this, 'KnowledgeBase', {
  embeddingsModel: bedrock.BedrockFoundationModel.TITAN_EMBED_TEXT_V1,
  instruction: 'Use this knowledge base to answer questions about books. ' +
    'It contains the full text of novels.',
});

In this case, it would be better to define constants specifying vectorDimensions for each of 1024, 512, and 256, right?

const kb = new bedrock.KnowledgeBase(this, 'KnowledgeBase', {
  embeddingsModel: bedrock.BedrockFoundationModel.TITAN_EMBED_TEXT_V2_1024,
  instruction: 'Use this knowledge base to answer questions about books. ' +
    'It contains the full text of novels.',
});

I believe the following code is relevant.

https://github.com/awslabs/generative-ai-cdk-constructs/blob/main/src/cdk-lib/bedrock/models.ts

krokoko commented 3 months ago

Hi @yuya-tajima, in the code you mentioned (https://github.com/awslabs/generative-ai-cdk-constructs/blob/main/src/cdk-lib/bedrock/models.ts), we do specify vectorDimensions, but only so that associated constructs such as the OpenSearch Serverless L2 or Aurora can use it when creating a vector store. That vectorDimensions value is not used when creating the knowledge base, since this parameter cannot be specified in the underlying L1 construct.

Basically, when you create a Bedrock Knowledge Base with our construct (L2) using the following code snippet:

const kb = new bedrock.KnowledgeBase(this, 'KnowledgeBase', {
  embeddingsModel: bedrock.BedrockFoundationModel.TITAN_EMBED_TEXT_V1,
  instruction: 'Use this knowledge base to answer questions about books. ' +
    'It contains the full text of novels.',
});

Behind the scenes, it uses the L1 Bedrock CDK Knowledge Base construct which, as you mentioned, takes only the embeddings model ARN as a parameter:

vectorKnowledgeBaseConfiguration: {
  embeddingModelArn: 'embeddingModelArn',
},

As far as I know, all 3 versions of Titan Text Embeddings V2 (1024, 512, and 256) have the same model arn, so we cannot programmatically specify which vectorDimensions to use. Once this parameter is exposed (or if different model ARNs become available), we will add the following models, as you suggested:

bedrock.BedrockFoundationModel.TITAN_EMBED_TEXT_V2_256
bedrock.BedrockFoundationModel.TITAN_EMBED_TEXT_V2_512
bedrock.BedrockFoundationModel.TITAN_EMBED_TEXT_V2_1024
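
To illustrate why the ARN alone cannot select a dimension size, here is a small sketch. The `foundationModelArn` helper is a hypothetical name introduced for this example, and the `us-east-1` region is only illustrative; the model ID is the one given later in this thread.

```typescript
// Hypothetical helper: builds a Bedrock foundation model ARN
// from a region and a model ID.
function foundationModelArn(region: string, modelId: string): string {
  return `arn:aws:bedrock:${region}::foundation-model/${modelId}`;
}

// The three dimension sizes all refer to the same model ID,
// so each variant ends up with an identical ARN.
const variants = [256, 512, 1024].map((vectorDimensions) => ({
  vectorDimensions,
  arn: foundationModelArn('us-east-1', 'amazon.titan-embed-text-v2:0'),
}));

// True when every variant shares one ARN, meaning a configuration
// that accepts only the ARN cannot tell the variants apart.
const allArnsIdentical = variants.every((v) => v.arn === variants[0].arn);
```

Because the ARN carries no dimension information, the size has to travel in a separate field before constants such as the ones above can map to distinct deployments.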

I am in contact with the CloudFormation team and will update this ticket as soon as I have more information. FYI, there is a known issue on the CloudFormation side where deploying a knowledge base with Titan Text Embeddings V2 triggers a deployment error (see https://github.com/awslabs/generative-ai-cdk-constructs/pull/495#issuecomment-2151069502).

Note: If a new model is available and not supported yet by our library, you can instantiate it yourself in your code, for instance:

const myModel = new bedrock.BedrockFoundationModel('amazon.newmodel', {
  supportsKnowledgeBase: true,
  vectorDimensions: 1024,
});

where amazon.newmodel represents the model_id of the model as described in the documentation (here for Knowledge Bases, here for Agents)

yuya-tajima commented 3 months ago

Hi @krokoko,

As far as I know, all 3 versions of Titan Text Embeddings V2 (1024, 512, and 256) have the same model arn, so we cannot programmatically specify which vectorDimensions to use.

I understand it well. Now I'll wait for the changes on the CloudFormation side.

Thank you for taking the time to reply.

amoghgaikwad commented 3 months ago

Currently, you can add TITAN_EMBED_TEXT_V2 using the following code:

// Currently only 1024 is supported as the vector dimension size,
// but this does use the Titan Embed V2 model.
const TITAN_EMBED_TEXT_V2 = new bedrock.BedrockFoundationModel(
  "amazon.titan-embed-text-v2:0",
  { supportsKnowledgeBase: true, vectorDimensions: 1024 },
);

and then use it in your KB:

const kb = new bedrock.KnowledgeBase(this, 'KnowledgeBase', {
  embeddingsModel: TITAN_EMBED_TEXT_V2,
  instruction: 'Use this knowledge base to answer questions about books. ' +
    'It contains the full text of novels.',
});

krokoko commented 2 months ago

Thanks @amoghgaikwad, the issue with parsing Titan Embed V2 was indeed fixed by the team; I will reopen the previously closed PR. We still cannot specify the dimension size, though.

yuya-tajima commented 2 months ago

@amoghgaikwad @krokoko Sorry for the late reply. Thank you both for your replies.

krokoko commented 2 weeks ago

The dimension size can now be specified. Linking this ticket to the PR.
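
For readers wondering what this could look like at the configuration level, here is a hedged sketch. It assumes the dimension size sits in a nested field next to the model ARN; the nested field names, the `VectorKnowledgeBaseConfigurationSketch` type, and the `buildVectorConfig` helper are assumptions for illustration and are not confirmed by this thread, so check them against the CloudFormation reference and the linked PR. The region in the ARN is also only illustrative.

```typescript
// Assumed shape: the nested field names are not confirmed by this thread.
interface VectorKnowledgeBaseConfigurationSketch {
  embeddingModelArn: string;
  embeddingModelConfiguration?: {
    bedrockEmbeddingModelConfiguration?: { dimensions?: number };
  };
}

// Hypothetical helper: includes the nested dimension setting only
// when a dimension size is explicitly requested.
function buildVectorConfig(
  embeddingModelArn: string,
  dimensions?: number,
): VectorKnowledgeBaseConfigurationSketch {
  if (dimensions === undefined) {
    return { embeddingModelArn };
  }
  return {
    embeddingModelArn,
    embeddingModelConfiguration: {
      bedrockEmbeddingModelConfiguration: { dimensions },
    },
  };
}

const config = buildVectorConfig(
  'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0',
  512,
);
```

With a separate field like this, the same model ARN can be paired with any of the three sizes, which is what the per-size constants discussed earlier would need.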
