aws-amplify / docs

AWS Amplify Framework Documentation
https://docs.amplify.aws

AI kit does not support Cross-region inference #8121

Open rpostulart opened 1 day ago

rpostulart commented 1 day ago

Environment information

System:
  OS: macOS 14.6.1
  CPU: (16) x64 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
  Memory: 94.01 MB / 16.00 GB
  Shell: /bin/zsh
Binaries:
  Node: 22.9.0 - /usr/local/bin/node
  Yarn: undefined - undefined
  npm: 10.8.3 - /usr/local/bin/npm
  pnpm: undefined - undefined
NPM Packages:
  @aws-amplify/auth-construct: 1.5.0
  @aws-amplify/backend: 1.8.0
  @aws-amplify/backend-auth: 1.4.1
  @aws-amplify/backend-cli: 1.4.2
  @aws-amplify/backend-data: 1.2.1
  @aws-amplify/backend-deployer: 1.1.9
  @aws-amplify/backend-function: 1.8.0
  @aws-amplify/backend-output-schemas: 1.4.0
  @aws-amplify/backend-output-storage: 1.1.3
  @aws-amplify/backend-secret: 1.1.5
  @aws-amplify/backend-storage: 1.2.3
  @aws-amplify/cli-core: 1.2.0
  @aws-amplify/client-config: 1.5.2
  @aws-amplify/deployed-backend-client: 1.4.2
  @aws-amplify/form-generator: 1.0.3
  @aws-amplify/model-generator: 1.0.9
  @aws-amplify/platform-core: 1.2.1
  @aws-amplify/plugin-types: 1.5.0
  @aws-amplify/sandbox: 1.2.6
  @aws-amplify/schema-generator: 1.2.5
  aws-amplify: 6.9.0
  aws-cdk: 2.169.0
  aws-cdk-lib: 2.169.0
  typescript: 5.6.3
No AWS environment variables
No CDK environment variables

Describe the bug

In the schema I can only define the model like this:

const schema = a.schema({
  chat: a
    .conversation({
      aiModel: a.ai.model("Claude 3.5 Sonnet"),
      systemPrompt: `You are a very helpful assistant`,
    })
    .authorization((allow) => allow.owner()),
});

But I get an error in my region because it only allows access to Claude 3.5 via an inference profile. This is the error in the Lambda:

{
    "timestamp": "2024-11-21T22:03:13.695Z",
    "level": "ERROR",
    "requestId": "37377ab9-f496-4cc3-b5a1-70115062ea0f",
    "message": "Failed to handle conversation turn event, currentMessageId=2ef0c357-e7ab-45b6-92be-71b855c65597, conversationId=411d1032-0704-4384-8952-5db49f27e5b1 ValidationException: Invocation of model ID anthropic.claude-3-5-sonnet-20240620-v1:0 with on-demand throughput isn’t supported. Retry your request with the ID or ARN of an inference profile that contains this model.\n    at de_ValidationExceptionRes (/var/runtime/node_modules/@aws-sdk/client-bedrock-runtime/dist-cjs/index.js:1195:21)\n    at de_CommandError (/var/runtime/node_modules/@aws-sdk/client-bedrock-runtime/dist-cjs/index.js:1028:19)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/middleware-serde/dist-cjs/index.js:35:20\n    at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/core/dist-cjs/index.js:165:18\n    at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/middleware-retry/dist-cjs/index.js:320:38\n    at async /var/runtime/node_modules/@aws-sdk/middleware-logger/dist-cjs/index.js:34:22\n    at async BedrockConverseAdapter.askBedrockStreaming (/var/task/index.js:813:29)\n    at async ConversationTurnExecutor.execute (/var/task/index.js:1009:32)\n    at async Runtime.handleConversationTurnEvent [as handler] (/var/task/index.js:1043:7) {\n  '$fault': 'client',\n  '$metadata': {\n    httpStatusCode: 400,\n    requestId: 'dbb16273-798b-4bad-946b-ac30835b2c0f',\n    extendedRequestId: undefined,\n    cfId: undefined,\n    attempts: 1,\n    totalRetryDelay: 0\n  }\n}",
    "errorType": "ValidationException",
    "errorMessage": "Invocation of model ID anthropic.claude-3-5-sonnet-20240620-v1:0 with on-demand throughput isn’t supported. Retry your request with the ID or ARN of an inference profile that contains this model.",
    "stackTrace": [
        "ValidationException: Invocation of model ID anthropic.claude-3-5-sonnet-20240620-v1:0 with on-demand throughput isn’t supported. Retry your request with the ID or ARN of an inference profile that contains this model.",
        "    at de_ValidationExceptionRes (/var/runtime/node_modules/@aws-sdk/client-bedrock-runtime/dist-cjs/index.js:1195:21)",
        "    at de_CommandError (/var/runtime/node_modules/@aws-sdk/client-bedrock-runtime/dist-cjs/index.js:1028:19)",
        "    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)",
        "    at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/middleware-serde/dist-cjs/index.js:35:20",
        "    at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/core/dist-cjs/index.js:165:18",
        "    at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/middleware-retry/dist-cjs/index.js:320:38",
        "    at async /var/runtime/node_modules/@aws-sdk/middleware-logger/dist-cjs/index.js:34:22",
        "    at async BedrockConverseAdapter.askBedrockStreaming (/var/task/index.js:813:29)",
        "    at async ConversationTurnExecutor.execute (/var/task/index.js:1009:32)",
        "    at async Runtime.handleConversationTurnEvent [as handler] (/var/task/index.js:1043:7)"
    ]
}

Reproduction steps

I watched the CloudWatch log errors because I didn't get a response in the front end.

rpostulart commented 16 hours ago

It's unclear in the docs, but you can solve it by putting the inference profile ID in the resourcePath:

const schema = a.schema({
  chat: a
    .conversation({
      aiModel: {
        resourcePath: "eu.anthropic.claude-3-5-sonnet-20240620-v1:0",
      },
      systemPrompt: `You are a very helpful assistant`,
    })
    .authorization((allow) => allow.owner()),
});

But you also need to update the resource policy of the Lambda that invokes Bedrock:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:*::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0",
                "arn:aws:bedrock:*:114244416074:inference-profile/anthropic.claude-3-5-sonnet-20240620-v1:0",
                "arn:aws:bedrock:*:114244416074:inference-profile/eu.anthropic.claude-3-5-sonnet-20240620-v1:0"
            ],
            "Effect": "Allow"
        }
    ]
}
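
If you prefer to keep this in code rather than editing the role by hand, the same statement can be written with the CDK PolicyStatement construct and attached to the handler's execution role (the example later in this thread shows how to attach it with addToRolePolicy on a custom conversation handler). A rough sketch reusing the account ID and model IDs from the JSON above:

import { PolicyStatement } from "aws-cdk-lib/aws-iam";

// Mirrors the JSON policy above: allow invoking the foundation model directly
// and through the (cross-region) inference profiles.
const bedrockAccess = new PolicyStatement({
  actions: ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
  resources: [
    "arn:aws:bedrock:*::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0",
    "arn:aws:bedrock:*:114244416074:inference-profile/anthropic.claude-3-5-sonnet-20240620-v1:0",
    "arn:aws:bedrock:*:114244416074:inference-profile/eu.anthropic.claude-3-5-sonnet-20240620-v1:0",
  ],
});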

It would be great if the docs could be adjusted to cover this.

ykethan commented 8 hours ago

Hey, thanks for the feedback and information on how this can be solved. Transferring the issue to the documentation repository for updates.

atierian commented 7 hours ago

Thanks for opening this @rpostulart. We'll add an example to the docs. In the meantime, here's an example of how you can do this directly within your Amplify backend.

Add a custom conversation handler

Add the @aws-amplify/backend-ai package

npm install @aws-amplify/backend-ai

In amplify/data/resource.ts

import { a, defineData, type ClientSchema } from "@aws-amplify/backend";
import { defineConversationHandlerFunction } from "@aws-amplify/backend-ai/conversation";

// Declare the base model ID first, then derive the cross-region inference profile ID from it.
export const model = 'anthropic.claude-3-5-sonnet-20240620-v1:0';
export const crossRegionModel = `eu.${model}`;

export const conversationHandler = defineConversationHandlerFunction({
  entry: "./conversationHandler.ts",
  name: "conversationHandler",
  models: [{ modelId: crossRegionModel }],
});

const schema = a.schema({
  chat: a
    .conversation({
      aiModel: {
        resourcePath: crossRegionModel,
      },
      systemPrompt: 'You are a helpful assistant.',
      handler: conversationHandler,
    })
    .authorization((allow) => allow.owner()),
});
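
For completeness, the rest of amplify/data/resource.ts exports the data resource as usual. A minimal sketch, assuming a Cognito user pool is your default authorization mode (adjust to your project):

export type Schema = ClientSchema<typeof schema>;

export const data = defineData({
  schema,
  authorizationModes: {
    defaultAuthorizationMode: "userPool",
  },
});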

Create a new file amplify/data/conversationHandler.ts

import { handleConversationTurnEvent } from '@aws-amplify/backend-ai/conversation/runtime';

export const handler = handleConversationTurnEvent;

In amplify/backend.ts

import { defineBackend } from "@aws-amplify/backend";
import { auth } from "./auth/resource";
import { data, conversationHandler, crossRegionModel, model } from "./data/resource";
import { PolicyStatement } from "aws-cdk-lib/aws-iam";

const backend = defineBackend({
  auth,
  data,
  conversationHandler,
});

// This policy statement assumes that you're deploying in `eu-west-1`. 
// If that's not the case, adjust the resources block in the policy statements accordingly.
backend.conversationHandler.resources.lambda.addToRolePolicy(
  new PolicyStatement({
    resources: [
      `arn:aws:bedrock:eu-west-1:[account-number]:inference-profile/${crossRegionModel}`,
      `arn:aws:bedrock:eu-west-1::foundation-model/${model}`,
      `arn:aws:bedrock:eu-west-3::foundation-model/${model}`,
      `arn:aws:bedrock:eu-central-1::foundation-model/${model}`,
    ],
    actions: [
      'bedrock:InvokeModelWithResponseStream'
    ],
  })
);
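
As a side note, if you want to avoid hardcoding the region and account in those ARNs, one option (an untested sketch, in addition to the imports above) is to derive them from the handler's stack with the CDK Stack.of helper:

import { Stack } from "aws-cdk-lib";

// Resolve the region and account of the stack the conversation handler is deployed into.
const stack = Stack.of(backend.conversationHandler.resources.lambda);

backend.conversationHandler.resources.lambda.addToRolePolicy(
  new PolicyStatement({
    resources: [
      // Inference profile in the deployment account/region.
      `arn:aws:bedrock:${stack.region}:${stack.account}:inference-profile/${crossRegionModel}`,
      // Foundation-model ARNs for the regions the EU inference profile can route to.
      ...["eu-west-1", "eu-west-3", "eu-central-1"].map(
        (region) => `arn:aws:bedrock:${region}::foundation-model/${model}`
      ),
    ],
    actions: [
      'bedrock:InvokeModelWithResponseStream'
    ],
  })
);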

rpostulart commented 6 hours ago

OK, this is great. I will close the issue with your commitment that you will update the docs :)