Project ACTIVE as of Oct 22, 2024
This project provides sample code and CDK for deploying APIs which provide a secure and scalable interface with Amazon Bedrock.
This project allows users to deploy a REST API that supports streaming responses from Amazon Bedrock through an OpenAI-compatible interface.
It supports additional features such as:
This demo video shows an LLM chatbot powered by the AWS LLM Gateway and Bedrock Streaming.
This demo video shows the OpenAI client being used with an LLMGateway API key to call the LLMGateway and use the Bedrock Claude Haiku LLM.
## Creating your certificate

Follow the instructions here to create a certificate with AWS Certificate Manager: https://docs.aws.amazon.com/acm/latest/userguide/gs-acm-request-public.html
Follow the instructions here to validate your domain ownership for your certificate: https://docs.aws.amazon.com/acm/latest/userguide/domain-ownership-validation.html
You will need to make two certificates. They can both be for subdomains under a single domain you own. One will be for the LLMGateway UI (`UI_CERT_ARN` and `UI_DOMAIN_NAME` in your `.env` file), and the other will be for the LLMGateway API (`LLM_GATEWAY_CERT_ARN` and `LLM_GATEWAY_DOMAIN_NAME` in your `.env` file).
## Azure AD Authentication Steps

1. Log in to the Azure Portal.
2. In the Azure Services section, choose Azure Active Directory.
3. In the left sidebar, choose Enterprise applications.
4. Choose New application.
5. On the Browse Azure AD Gallery page, choose Create your own application.
6. Under What's the name of your app?, enter a name for your application, select Integrate any other application you don't find in the gallery (Non-gallery), as shown in Figure 2, and choose Create. It will take a few seconds for the application to be created in Azure AD, and then you should be redirected to the Overview page for the newly added application. Note: Occasionally, this step can result in a Not Found error, even though Azure AD has successfully created a new application. If that happens, navigate back to Enterprise applications in Azure AD and search for your application by name.
7. On the Getting started page, in the Set up single sign on tile, choose Get started, as shown in Figure 3.
8. On the next screen, select SAML.
9. Scroll down to the SAML Signing Certificate section, and copy the App Federation Metadata Url by choosing the copy-to-clipboard icon (highlighted with a red arrow in Figure 6). In your `.env` file, use this value for the `METADATA_URL_COPIED_FROM_AZURE_AD` variable.
10. Complete the deployment steps in the Deployment Steps section of this ReadMe.
11. In the output of the deployment stack, you will see `LlmGatewayStack.EntityId` and `LlmGatewayStack.ReplyURL`. Keep these in a text editor, as you'll need them in the next step.
12. Make sure you're back on the SAML page you were on in steps 8 and 9.
13. In the middle pane under Set up Single Sign-On with SAML, in the Basic SAML Configuration section, choose the edit icon.
14. In the right pane under Basic SAML Configuration, replace the default Identifier ID (Entity ID) with the `LlmGatewayStack.EntityId` you copied previously. In the Reply URL (Assertion Consumer Service URL) field, enter the `LlmGatewayStack.ReplyURL` you copied previously, as shown in Figure 4. Choose Save.
15. In the middle pane under Set up Single Sign-On with SAML, in the User Attributes & Claims section, choose Edit.
16. Choose Add a group claim.
17. On the User Attributes & Claims page, in the right pane under Group Claims, select Groups assigned to the application and leave Source attribute as Group ID, as shown in Figure 5. Choose Save.
18. Go to the URL in the `LlmGatewayStack.StreamlitUiUrl` stack output, and you should be prompted to log in using your Azure AD credentials. Make sure you have added the user you want to log in with to the application you created in steps 4-6 (within the application, go to Users and groups -> Add user/group, then add your desired users).
## GitHub Authentication Steps

1. Create a GitHub OAuth app, setting its authorization callback URL to `https://<COGNTIO_DOMAIN_PREFIX>.auth.<Your AWS Region>.amazoncognito.com/oauth2/idpresponse`. Put the Client ID in `GIT_HUB_CLIENT_ID` in your `.env` file, and the Secret in `GIT_HUB_CLIENT_SECRET` in your `.env` file.
2. Deploy https://github.com/TimothyJones/github-cognito-openid-wrapper by following its "2a: Deployment with lambda and API Gateway" instructions. (IMPORTANT: The github-cognito-openid-wrapper allows all individuals with a GitHub account to log in to your app by default. To prevent this, you will have to fork the repo and make edits in order to lock it down and restrict access to only certain users.) One of its stack outputs is `GitHubShimIssuer`, which will be a URL. Put that URL in `GIT_HUB_PROXY_URL` in your `.env` file.
3. Complete the deployment steps in the Deployment Steps section below.

## Deployment Steps

1. `cd` into the `cdk` folder.
2. Run `cp template.env .env`.
3. Set `COGNTIO_DOMAIN_PREFIX` to a globally unique alphanumeric string.
4. Set `UI_CERT_ARN` to the ARN of the first certificate you created in the Creating your certificate section of this ReadMe.
5. Set `UI_DOMAIN_NAME` to the first subdomain you created in the Creating your certificate section of this ReadMe.
6. Set `LLM_GATEWAY_CERT_ARN` to the ARN of the second certificate you created in the Creating your certificate section of this ReadMe.
7. Set `LLM_GATEWAY_DOMAIN_NAME` to the second subdomain you created in the Creating your certificate section of this ReadMe.
8. If you want to use Azure AD for authentication, follow the steps in the Azure AD Authentication Steps section of this ReadMe and make sure to populate `METADATA_URL_COPIED_FROM_AZURE_AD` in your `.env` file.
9. If you want to use GitHub for authentication, follow the steps in the GitHub Authentication Steps section of this ReadMe and make sure to populate `GIT_HUB_CLIENT_ID`, `GIT_HUB_CLIENT_SECRET`, and `GIT_HUB_PROXY_URL` in your `.env` file.
10. In `ADMIN_LIST` in your `.env` file, specify a comma-separated list of usernames that you want to have the admin role. The admin role allows you to create usage quotas and model access policies.
11. Edit any other settings you like. The full list of settings and what they do is below.
12. Run `./deploy.sh`. If you need to make adjustments to your Lambda code, simply re-run `./deploy.sh`.
13. If you are using Azure AD or GitHub for authentication, skip this step. To use Cognito authentication against the API Gateway WebSocket, you'll need a Cognito user. Create one with your desired username and password using the `python3 create_cognito_user.py` script in the `scripts` folder. Once you do that, Streamlit will automatically use the user you created to authenticate to the LLM Gateway.
14. Go to the URL in the `LlmGatewayStack.StreamlitUiUrl` stack output.
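To make the steps above concrete, here is a hypothetical `.env` fragment. All values below are placeholders, not working defaults (and `COGNTIO_DOMAIN_PREFIX` is spelled the way the template spells it):

```shell
# Placeholder values - substitute your own
COGNTIO_DOMAIN_PREFIX=myuniqueprefix123
UI_CERT_ARN=arn:aws:acm:us-east-1:111122223333:certificate/aaaa-1111
UI_DOMAIN_NAME=ui.example.com
LLM_GATEWAY_CERT_ARN=arn:aws:acm:us-east-1:111122223333:certificate/bbbb-2222
LLM_GATEWAY_DOMAIN_NAME=api.example.com
ADMIN_LIST=alice,bob
```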
The following are settings which you can configure as needed for your project in your `.env` file:

- `COGNTIO_DOMAIN_PREFIX` (Required) Globally unique alphanumeric string that acts as a prefix to your Cognito domain used for authentication.
- `UI_CERT_ARN` (Required) The ARN of the first AWS Certificate Manager certificate that you created in the Creating your certificate section of this ReadMe. Certificate for the UI.
- `UI_DOMAIN_NAME` (Required) The first subdomain you created in the Creating your certificate section of this ReadMe. Domain name for the UI.
- `LLM_GATEWAY_CERT_ARN` (Required) The ARN of the second AWS Certificate Manager certificate that you created in the Creating your certificate section of this ReadMe. Certificate for the LLMGateway API.
- `LLM_GATEWAY_DOMAIN_NAME` (Required) The second subdomain you created in the Creating your certificate section of this ReadMe. Domain name for the LLMGateway API.
- `METADATA_URL_COPIED_FROM_AZURE_AD` (Optional) Field needed for Azure AD authentication. Detailed in the Azure AD Authentication Steps section of this ReadMe.
- `GIT_HUB_CLIENT_ID` (Optional) Field needed for GitHub authentication. Detailed in the GitHub Authentication Steps section of this ReadMe.
- `GIT_HUB_CLIENT_SECRET` (Optional) Field needed for GitHub authentication. Detailed in the GitHub Authentication Steps section of this ReadMe.
- `GIT_HUB_PROXY_URL` (Optional) Field needed for GitHub authentication. Detailed in the GitHub Authentication Steps section of this ReadMe.
- `ADMIN_LIST` (Optional) Comma-separated list of usernames that you want to have the admin role. The admin role allows you to create usage quotas and model access policies.
- `ECR_STREAMLIT_REPOSITORY` (Required) Name of the ECR repository that will store the Streamlit UI Docker container image.
- `ECR_API_KEY_REPOSITORY` (Required) Name of the ECR repository that will store the API key management Lambda function Docker container image.
- `ECR_LLM_GATEWAY_REPOSITORY` (Required) Name of the ECR repository that will store the LLMGateway API Docker container image.
- `ECR_QUOTA_REPOSITORY` (Required) Name of the ECR repository that will store the usage quota management Lambda function Docker container image.
- `ECR_MODEL_ACCESS_REPOSITORY` (Required) Name of the ECR repository that will store the model access management Lambda function Docker container image.
- `LLM_GATEWAY_IS_PUBLIC` (Required) Whether or not the Application Load Balancer that provides access to the LLMGateway API is accessible from the internet.
- `SERVERLESS_API` (Required) Whether the LLMGateway API is serverless (Lambda) or not (Elastic Container Service (ECS)). Currently, streaming is not supported with serverless.
- `DEFAULT_QUOTA_FREQUENCY` (Required) The default period over which usage quotas apply. Currently only supports `weekly`, meaning a user's spending limit resets every week. Other time periods will eventually be supported.
- `DEFAULT_QUOTA_DOLLARS` (Required) The default amount of money in dollars every user can spend per week.
- `DEFAULT_MODEL_ACCESS` (Required) Comma-separated list of models every user will have access to by default.
- `DEBUG` (Required) Set to `true` to enable additional logging.

Once you have gone to the URL in the `LlmGatewayStack.StreamlitUiUrl` stack output in the final step of the Deployment Steps section, you will enter the UI. Below are images of each of the pages and what they do.
This is the page you will start on. On the right, you can select your provider (only Amazon Bedrock for now) and the LLM model you want to use. The list of models you see is determined by the model access policy set by your admin. Also on the right is your "Estimated usage for this week", showing how close you are to exceeding your weekly usage quota, which is likewise set by your admin. When you make a request, you can see exactly how much that request cost in the metrics section at the bottom right.
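The per-request cost shown in the metrics section is, at heart, token-count arithmetic. The sketch below is illustrative only (the gateway's actual accounting code is not shown here), and the per-1K-token prices are hypothetical:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Cost of one request: tokens consumed, scaled by per-1K-token prices."""
    return (input_tokens / 1000) * price_in_per_1k \
         + (output_tokens / 1000) * price_out_per_1k

# Hypothetical prices for a small model: $0.00025/1K input, $0.00125/1K output
cost = request_cost(input_tokens=1200, output_tokens=400,
                    price_in_per_1k=0.00025, price_out_per_1k=0.00125)
print(f"${cost:.6f}")
```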
This is the page where you can create an API key in order to use the LLMGateway from outside the main UI. You specify a name for the key and an expiration date (or set it to never expire), and click `Create`. The key is not saved, so copy it down when it appears on the screen. You can use this API key with an OpenAI client or tooling in exactly the same way you would use a real OpenAI API key, but running against Bedrock.
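As a minimal sketch of what an OpenAI-style request against the gateway might look like, using only the standard library. The `/api/v1/chat/completions` path, the Bearer-token header, and the model ID are assumptions for illustration; consult your deployed API for the exact endpoint:

```python
import json
import urllib.request

# Hypothetical values: substitute your LLM_GATEWAY_DOMAIN_NAME and an API key
# created on the API key page of the UI.
GATEWAY_URL = "https://api.example.com/api/v1/chat/completions"
API_KEY = "<your LLMGateway API key>"

def build_chat_request(model: str, prompt: str, stream: bool = False) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

def call_gateway(payload: dict) -> dict:
    """POST the payload to the gateway, authenticating with the API key."""
    req = urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (requires a deployed gateway and a valid key):
#   reply = call_gateway(build_chat_request(
#       "anthropic.claude-3-haiku-20240307-v1:0", "Hello!"))
#   print(reply["choices"][0]["message"]["content"])
```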
This is the page where you can view your existing API keys, their status (valid or expired), and when they expire. You can also delete any keys you no longer want.

These pages are Admin-only. If you are not an Admin, they will not be visible to you, and the corresponding APIs will return a 403 if you try to use them.
This is the page where you can assign a policy to a specific user to determine which models they can access. You type in the user's username and click `Submit`, and then choose from a multi-select drop-down menu the models they can use. Then, you click `Save Changes`. If you'd like to restore the defaults for that user, you can click `Reset To Defaults`.
This is the page where you can assign a usage quota to a specific user to determine how much money they can spend per week. You type in the user's username and click `Submit`. Then, choose from a multi-select drop-down menu the frequency of their quota (just weekly for now). Next, you choose the quota limit in dollars that they can spend each week. Finally, you click `Save Changes`. If you'd like to restore the defaults for that user, you can click `Reset To Defaults`.
This is the page where you can see how close a user is to exceeding their quota. You type in the user's username and click `Submit`. Then you can see details about their quota, how much of it they've consumed, and whether or not they've exceeded it.
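Conceptually, the quota check above compares a user's spend since the start of the current week against their weekly limit. The following is an illustrative sketch, not the gateway's actual implementation; it assumes quotas reset at the start of each calendar week (Monday):

```python
from datetime import date, timedelta

def week_start(d: date) -> date:
    """Monday of the week containing d (assumed reset boundary)."""
    return d - timedelta(days=d.weekday())

def weekly_spend(spend_by_date: dict[date, float], today: date) -> float:
    """Total spend recorded since the start of the current week."""
    start = week_start(today)
    return sum(cost for day, cost in spend_by_date.items()
               if start <= day <= today)

def is_over_quota(spend_by_date: dict[date, float], today: date,
                  quota_dollars: float) -> bool:
    """True if this week's spend exceeds the weekly quota."""
    return weekly_spend(spend_by_date, today) > quota_dollars

# Example: $3.00 spent Monday and $2.50 Tuesday against a $5.00 weekly quota
spend = {date(2024, 10, 21): 3.00, date(2024, 10, 22): 2.50}
print(is_over_quota(spend, date(2024, 10, 23), 5.00))  # prints True
```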
To add a new Bedrock model to the LLM Gateway API, you must do the following:

1. Add the model to the `_supported_models` variable in `lambdas/gateway/api/models/bedrock.py`.
2. Add the model's cost information to `lambdas/gateway/api/data/cost_db.csv`.

Note: you can see the list of models that Bedrock supports by running `aws bedrock list-foundation-models`.
Documentation
This repo has some load testing scripts. These are currently only set up to be used with pure Cognito (without Azure AD or GitHub auth enabled). Do the following to perform load testing:

1. Make sure `BENCHMARK_MODE` is set to `true` in your deployment config file. Benchmark mode deploys a fake Bedrock server and points the LLMGateway at it. This is useful if you want to test the scalability of the LLMGateway beyond your current Bedrock quota limits.
2. Make sure `METADATA_URL_COPIED_FROM_AZURE_AD`, `GIT_HUB_CLIENT_ID`, `GIT_HUB_CLIENT_SECRET`, and `GIT_HUB_PROXY_URL` are all empty.
3. `cd` into the `load_testing` folder.
4. Create a `config.json` file based on `config.template.json`. You can get the `client_secret` by going to your user pool in Amazon Cognito in the AWS Console, going to the App Integration tab, clicking on `ApplicationLoadBalancerClient` at the bottom of the page, and then copying the Client secret.
5. Run `python3 create_cognito_users.py <Number of desired users>`. This will create Cognito users which will call the LLM Gateway during the load test.
6. Run `python3 create_api_keys.py`. This will create LLM Gateway API keys for each of your created Cognito users.
7. Run `locust -f llm_gateway_load_testing.py --headless -u <Number of desired users> -r <Number of users to instantiate per second> --run-time <Runtime e.g. 1h>`. See the Locust documentation for more details.

See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.