awslabs / generative-ai-cdk-constructs

AWS Generative AI CDK Constructs are sample implementations of AWS CDK for common generative AI patterns.
https://awslabs.github.io/generative-ai-cdk-constructs/
Apache License 2.0

feat(opensearch serverless): analyzer #537

Closed statefb closed 3 months ago

statefb commented 3 months ago

Closes #536

Adds analyzer support to the construct: https://opensearch.org/docs/latest/analyzers/


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

statefb commented 3 months ago

example usage:

import { bedrock } from "@cdklabs/generative-ai-cdk-constructs";
import {
  CharacterFilterType,
  TokenFilterType,
  TokenizerType,
} from "@cdklabs/generative-ai-cdk-constructs/lib/cdk-lib/opensearchserverless";

const kb = new bedrock.KnowledgeBase(this, "KB", {
  embeddingsModel: bedrock.BedrockFoundationModel.TITAN_EMBED_TEXT_V1,
  instruction:
    "Use this knowledge base to answer questions. Please quote the reference to explain your answers.",
  // Custom analyzer for Japanese text: ICU normalization, kuromoji tokenization,
  // then base-form and stop-word token filters.
  analyzer: {
    characterFilters: [CharacterFilterType.ICU_NORMALIZER],
    tokenizer: TokenizerType.KUROMOJI_TOKENIZER,
    tokenFilters: [
      TokenFilterType.KUROMOJI_BASEFORM,
      TokenFilterType.JA_STOP,
    ],
  },
});
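
For readers unfamiliar with OpenSearch analyzers, here is a rough sketch of how analyzer props like the ones above could map onto the `analysis` section of an OpenSearch index's settings. This is not the construct's actual implementation: `buildAnalysisSettings`, the `AnalyzerProps` shape, and the analyzer name `custom_analyzer` are hypothetical, and the string values are assumed to match the OpenSearch ICU and kuromoji plugin component names.

```typescript
// Hypothetical stand-ins for the library enums used in the example above.
// The string values are the OpenSearch names of the ICU and kuromoji
// plugin components; the library's own values are assumed to match.
const CharacterFilterType = { ICU_NORMALIZER: "icu_normalizer" } as const;
const TokenizerType = { KUROMOJI_TOKENIZER: "kuromoji_tokenizer" } as const;
const TokenFilterType = {
  KUROMOJI_BASEFORM: "kuromoji_baseform",
  JA_STOP: "ja_stop",
} as const;

// Hypothetical props shape mirroring the `analyzer` object in the example.
interface AnalyzerProps {
  characterFilters: readonly string[];
  tokenizer: string;
  tokenFilters: readonly string[];
}

// Builds the `settings` part of an OpenSearch create-index request body,
// registering one custom analyzer under the given (assumed) name.
function buildAnalysisSettings(
  analyzer: AnalyzerProps,
  name: string = "custom_analyzer",
) {
  return {
    analysis: {
      analyzer: {
        [name]: {
          type: "custom",
          char_filter: [...analyzer.characterFilters],
          tokenizer: analyzer.tokenizer,
          filter: [...analyzer.tokenFilters],
        },
      },
    },
  };
}

const settings = buildAnalysisSettings({
  characterFilters: [CharacterFilterType.ICU_NORMALIZER],
  tokenizer: TokenizerType.KUROMOJI_TOKENIZER,
  tokenFilters: [TokenFilterType.KUROMOJI_BASEFORM, TokenFilterType.JA_STOP],
});

console.log(JSON.stringify(settings, null, 2));
```

In a custom analyzer, OpenSearch applies the character filters first, then the tokenizer, then the token filters in the order listed, so the order of `tokenFilters` matters.
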
krokoko commented 3 months ago

Also, could you please update the code snippet at https://github.com/awslabs/generative-ai-cdk-constructs/blob/main/src/cdk-lib/opensearch-vectorindex/README.md#vector-index with the new analyzer props?

krokoko commented 3 months ago

Build is not passing: https://github.com/awslabs/generative-ai-cdk-constructs/actions/runs/9752808672/job/26916988890?pr=537

krokoko commented 3 months ago

Build is not passing: https://github.com/awslabs/generative-ai-cdk-constructs/actions/runs/9752808672/job/26916988890?pr=537

fixed

krokoko commented 3 months ago

Great, thank you for your contribution, @statefb! I have approved; once a second reviewer approves, we will merge.