awslabs / generative-ai-cdk-constructs

AWS Generative AI CDK Constructs are sample implementations of AWS CDK for common generative AI patterns.
https://awslabs.github.io/generative-ai-cdk-constructs/
Apache License 2.0
337 stars, 51 forks

(bedrock): add inference profiles / cross-region inference #683

Open aws-rafams opened 1 month ago

aws-rafams commented 1 month ago

Describe the feature

In Amazon Bedrock, an inference profile is a configuration that routes model inference traffic across multiple AWS Regions for increased availability and throughput. See https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference-support.html

Use Case

Add inference profiles to improve availability and to reach models that are available only in certain Regions and can be accessed only through an inference profile.

Proposed Solution

Define BedrockInferenceProfile as follows:

import { Arn, ArnFormat, Resource, Stack } from 'aws-cdk-lib';
import { IConstruct } from 'constructs';
// BedrockFoundationModel is the model class already provided by this library.

export enum InferenceProfileRegion {
  /**
   * EU: Frankfurt (eu-central-1), Ireland (eu-west-1), Paris (eu-west-3)
   */
  EU = 'eu',
  /**
   * US: N. Virginia (us-east-1), Oregon (us-west-2)
   */
  US = 'us',
}

export interface InferenceProfileProps {
  /**
   * The geo region where the traffic is going to be distributed.
   */
  readonly region: InferenceProfileRegion;
  /**
   * A model supporting cross-region inference.
   * @see https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference-support.html
   */
  readonly model: BedrockFoundationModel;
}

export class BedrockInferenceProfile extends Resource {
  /**
   * @example 'us.anthropic.claude-3-5-sonnet-20240620-v1:0'
   */
  public readonly profileId: string;
  /**
   * @example 'arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.anthropic.claude-3-5-sonnet-20240620-v1:0'
   */
  public readonly profileArn: string;
  constructor(scope: IConstruct, id: string, props: InferenceProfileProps) {
    super(scope, id);
    // The profile ID is the geo prefix joined to the model ID, e.g. 'eu.<modelId>'.
    this.profileId = `${props.region}.${props.model.modelId}`;
    this.profileArn = Arn.format({
      service: 'bedrock',
      resource: 'inference-profile',
      resourceName: this.profileId,
      arnFormat: ArnFormat.SLASH_RESOURCE_NAME,
    }, Stack.of(scope));
  }
}
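
To make the naming scheme in the constructor concrete, here is a minimal sketch of the same composition as plain string functions, without any CDK dependency. The helper names `buildProfileId` and `buildProfileArn` and the `ProfileArnParts` type are hypothetical and exist only for illustration; the expected values are the `@example` strings from the class above.

```typescript
// Hypothetical helpers for illustration; not part of aws-cdk-lib or this library.
type GeoPrefix = 'eu' | 'us';

interface ProfileArnParts {
  partition: string; // e.g. 'aws'
  region: string;    // Region of the stack that references the profile
  account: string;   // 12-digit account ID of that stack
}

/** Joins the geo prefix and the model ID, as the proposed constructor does. */
function buildProfileId(geo: GeoPrefix, modelId: string): string {
  return `${geo}.${modelId}`;
}

/** Formats the ARN in the slash-separated "resource/resourceName" layout. */
function buildProfileArn(parts: ProfileArnParts, profileId: string): string {
  return `arn:${parts.partition}:bedrock:${parts.region}:${parts.account}:inference-profile/${profileId}`;
}

const profileId = buildProfileId('us', 'anthropic.claude-3-5-sonnet-20240620-v1:0');
const profileArn = buildProfileArn(
  { partition: 'aws', region: 'us-east-1', account: '123456789012' },
  profileId,
);

console.log(profileId);
console.log(profileArn);
```

In the proposed class, `Arn.format` with `Stack.of(scope)` fills in the partition, Region, and account from the enclosing stack, so those values are only hard-coded here to keep the sketch standalone.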

With this in place, an inference profile would be defined as follows:

new BedrockInferenceProfile(this, 'InferenceProfile', {
  region: InferenceProfileRegion.EU,
  model: BedrockFoundationModel.ANTHROPIC_CLAUDE_SONNET_V1_0,
});
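
One thing a consumer of such a profile would probably need is the list of resources to allow in an IAM policy. As far as I understand the cross-region inference documentation, invoking through a profile requires permission on the profile itself and on the foundation model in each Region the profile can route to. The sketch below only builds the ARN strings; `REGIONS_BY_GEO` and `invokeResourcesFor` are hypothetical names, the Region lists are copied from the enum comments above and may be out of date, and the `aws` partition is assumed.

```typescript
// Hypothetical sketch; names and Region lists are assumptions taken from the proposal text.
type GeoPrefix = 'eu' | 'us';

// Destination Regions per geo prefix, copied from the enum doc comments.
const REGIONS_BY_GEO: Record<GeoPrefix, readonly string[]> = {
  eu: ['eu-central-1', 'eu-west-1', 'eu-west-3'],
  us: ['us-east-1', 'us-west-2'],
};

/**
 * Returns the inference profile ARN followed by one foundation-model ARN
 * per destination Region. Foundation-model ARNs carry no account ID.
 */
function invokeResourcesFor(
  geo: GeoPrefix,
  modelId: string,
  account: string,
  callerRegion: string,
): string[] {
  const profileArn =
    `arn:aws:bedrock:${callerRegion}:${account}:inference-profile/${geo}.${modelId}`;
  const modelArns = REGIONS_BY_GEO[geo].map(
    (region) => `arn:aws:bedrock:${region}::foundation-model/${modelId}`,
  );
  return [profileArn, ...modelArns];
}

const resources = invokeResourcesFor(
  'eu',
  'anthropic.claude-3-5-sonnet-20240620-v1:0',
  '123456789012',
  'eu-west-1',
);
console.log(resources.join('\n'));
```

A construct-level grant method could wrap a list like this in a policy statement, but that design is outside what this issue proposes.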

Other Information

No response

aws-rafams commented 2 weeks ago

Support for inference profiles has arrived in Knowledge Bases, and thus in Prompt Management. I will create a PR after #668 is merged.