Retrieval Augmented Generation Pattern for Watsonx on IBM Cloud

The following deployable architecture automates the deployment of a sample GenAI Pattern on IBM Cloud, including all underlying IBM Cloud infrastructure. This architecture implements the best practices for Watsonx GenAI Pattern deployment on IBM Cloud, as described in the reference architecture.

This deployable architecture provides a comprehensive foundation for trust, observability, security, and regulatory compliance by configuring the IBM Cloud account to align with compliance settings, deploying key and secret management services, and deploying the infrastructure to support CI/CD/CC pipelines for secure application lifecycle management. These pipelines facilitate the deployment of the application, vulnerability checks, and auditability, ensuring a secure and trustworthy deployment of Generative AI applications on IBM Cloud.

Objective and Benefits

This deployable architecture showcases a fully automated deployment of a retrieval augmented generation (RAG) application through IBM Cloud Project, providing a flexible and customizable foundation for your own Watson-based application deployments on IBM Cloud. By default, the architecture deploys a sample RAG application.

By leveraging this architecture, you can accelerate your deployment and tailor it to meet your unique business needs and enterprise goals.

By using this architecture, you can:

Establish Trust: The architecture ensures trust by configuring the IBM Cloud account to align with compliance settings as defined in the Financial Services framework.
Ensure Observability: The architecture provides observability by deploying services such as IBM Log Analysis, IBM Monitoring, IBM Activity Tracker, and log retention through Cloud Object Storage buckets.
Implement Security: The architecture ensures security by deploying IBM Key Protect and IBM Secrets Manager.
Achieve Regulatory Compliance: The architecture supports regulatory compliance by implementing CI/CD/CC pipelines, along with IBM Cloud Security and Compliance Center (SCC), for secure application lifecycle management.

Deployment Details

To deploy this architecture, follow these steps.

1. Prerequisites

Before deploying the deployable architecture, ensure you have:

Created an API key in the target account with sufficient permissions. The target account is the account that hosts the resources deployed by this deployable architecture. See instructions. Note the API key, as it is used later. In evaluation environments, you can simply grant the Administrator role on IAM Identity Service, All Identity and Access enabled services, and All Account Management services. If you need to narrow access further, for example in a production environment, the minimum level of permissions is indicated in the Permission tab of the deployable architecture.
(Recommended to ensure successful sample app deployment) Created or have access to a signing key. The signing key is a base64-encoded GPG key: generate a key with gpg --gen-key without a passphrase (if you have not generated one before, or if yours has expired), then export it with the gpg --export-secret-key <Email Address> | base64 command. See the devsecops image signing page for details. Keep note of the key for later. The signing key is not required to deploy the Cloud resources created by this deployable architecture, but the automation needs it to build and deploy the sample application.
(Optional) Installed the IBM Cloud CLI's Project add-on using the ibmcloud plugin install project command. More information is available here.
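The prerequisites above can be sketched as a few shell helpers. This is a minimal sketch, not part of the deployable architecture: the function names, the key name, and the output file are hypothetical, and it assumes that you are already logged in to the target account with the IBM Cloud CLI and that a GPG key exists for the given email address. Stripping newlines from the base64 output is an assumption made here so that the value can be pasted into a single-line input field; the command quoted above does not do this.

```shell
#!/usr/bin/env bash
# Sketch only: function names, key name, and file name are hypothetical.

# Create a user API key in the target account and save it to a local file.
# Assumes you are already logged in to the target account.
create_api_key() {
  local name="$1" outfile="$2"
  ibmcloud iam api-key-create "$name" -d "API key for the RAG deployable architecture" --file "$outfile"
}

# Encode stdin as base64 on a single line (GNU base64 wraps long output by default).
encode_single_line() {
  base64 | tr -d '\n'
}

# Export an existing GPG secret key as a base64 string.
# Assumes the key was generated beforehand with gpg --gen-key and no passphrase.
export_signing_key() {
  local email="$1"
  gpg --export-secret-key "$email" | encode_single_line
}

# Optional: install the Project plug-in for the IBM Cloud CLI.
install_project_plugin() {
  ibmcloud plugin install project
}
```

For example, after sourcing the file, running create_api_key rag-da-key ./rag-da-key.json followed by export_signing_key you@example.com would produce the two values that are entered later as api_key and signing_key. Treat both as secrets and avoid leaving them in shell history or shared files.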

Also make sure that you are familiar with the "Important Deployment Considerations" section later in this document.

2. Deploy the Stack in a New Project from Catalog

Locate the tile for the Deployable Architecture in the IBM Cloud Catalog.
Click the "Add to project" button.
Select Create new and enter the following details:
- Name and Description (e.g., "Retrieval Augmented Generation Pattern")
- Region and Resource Group for the project. For evaluation purposes, for example, you can select the region closest to you and the Default resource group. For more insight into the recommended production topology, refer to the Enterprise account architecture Central administration account white paper.
- Configuration Name (name of the automation in the project, e.g., "RAG", "dev" or "prod", ideally matching the deployment target, but this can be any name)
Click the Add button (or Create if this is the first project in the account) at the bottom right of the modal popup to complete.

3. Set the Input Configuration for the Stack

After completing Step 2 - Deploy the Stack in a New Project from Catalog, you are directed to a page where you can enter the configuration for your deployment:

Under Security -> Authentication, enter the API key from the prerequisites in the api_key field.
Under Required, input a prefix. This prefix is prepended to the name of most resources created by the automation, which keeps names unique and avoids clashes when provisioning resources in the same account.
Under Optional, fill in the signing_key field. The signing key is not necessary for deploying the Cloud resources, but it is recommended because it is required to build and deploy the sample app.

You can explore the other available inputs, such as the region and resource group name (under the Optional tab), and either leave them as is or modify them as needed.

Once ready, click the "Save" button at the top of the screen.

4. Deploy the Architecture

Navigate to the project deployment view by clicking the project name in the breadcrumb menu.

You should be directed to a screen like the following:

[Screenshot: validate]

Note: In rare cases, the first member of the stack may not be marked as "Ready to validate". Refreshing the page in your browser should solve this problem.

There are two approaches to deploying the architecture:

Fully Automated End-to-End. Recommended for demo or non-critical environments. This approach allows Project to validate, approve, and deploy all stack members automatically.
Member-by-Member. Recommended for critical environments, such as production. This approach enables a detailed review of changes from each stack member before automation is executed, ensuring precise control over the deployment process.

Approach 1: Fully Automated End-to-End

To enable auto-deployment:

Go to Manage > Settings > Auto-deploy and toggle On.
Return to the Configurations tab and click Validate under stack configuration.

The project will then validate, approve, and deploy each stack member, taking approximately one hour to complete.

Approach 2: Member-by-Member

1. Click Validate.
2. Wait for the validation to complete.
3. Approve the changes and click the Deploy button.
4. Wait for the deployment to complete.
5. Repeat from step 1 for the next configuration in the architecture. Note that once the initial base configuration is deployed, you are given the option to validate and deploy multiple configurations in parallel.

5. Post-Deployment Steps

At this point, the infrastructure has been successfully deployed in the target account, and the initial build of the sample application has started in the newly-provisioned DevOps service.

Monitoring the Build and Deployment

To monitor the build and deployment of the application, follow these steps:

Access the DevOps Toolchains View: Navigate to the DevOps / Toolchains view in the target account.
Select the Resource Group and Region: Choose the resource group and region where the infrastructure was deployed. The resource group name is based on the prefix and resource_group_name inputs of the deployable architecture.
Select the Toolchain: Select "RAG Sample App-CI-Toolchain".
Access the Delivery Pipeline: In the toolchain view, select ci-pipeline under Delivery pipeline.
View the CI Pipeline Status: The current status of the CI pipeline execution can be found under the "rag-webhook-trigger" section.

Verifying the Application Deployment

Once the initial run of the CI pipeline completes, you should be able to view the application running in the created Code Engine project.
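The same check can be done from the command line. The following is a minimal sketch, assuming that the Code Engine plug-in for the IBM Cloud CLI is installed and that you are logged in to the account and region where the stack was deployed. The function name is hypothetical, and the project name must be replaced with the name of the Code Engine project created by the automation.

```shell
#!/usr/bin/env bash
# Sketch only: the function name is hypothetical, and the project name passed in
# must match the Code Engine project created by the automation.

show_sample_app() {
  local project_name="$1"
  # Target the Code Engine project, then list its applications and their status.
  ibmcloud ce project select --name "$project_name" &&
    ibmcloud ce application list
}
```

Calling show_sample_app with the project name lists the applications in that project; the sample app should appear in the list once the CI pipeline has deployed it.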

Enabling Watson Assistant

After the application has been built and is running in Code Engine, there are additional steps specific to the sample app that need to be completed to fully enable Watson Assistant in the app. To complete the installation, follow the steps outlined in the application README.md file.

6. Important Deployment Considerations

API Key Requirements

The deployable architecture can only be deployed with an API key associated with a user. It is not compatible with API keys associated with a service ID. Additionally, it cannot be deployed using the Project trusted profile support.

Notification of New Configuration Versions ("Needs Attention")

You may see notifications in IBM Cloud Project indicating that one or more configurations in the stack have new versions available. You can safely ignore these messages at this point, as they will not prevent you from deploying the stack. No specific action is required from you.

[Screenshot: new version]

Please note that these notifications are expected, as we are rapidly iterating on the development of the underlying components. As new stack versions become available, the versions of the underlying components will also be updated accordingly.

Limitations of the Trial Secrets Manager Offering

The automation is configured to deploy a Trial version of Secrets Manager by default to minimize costs. However, the Trial version has some limitations. To avoid them, you can opt to deploy a standard (paid) instance of Secrets Manager under the Optional settings of the stack.

Here are the limitations of the Trial version:

Account limitation: Only one Trial instance of Secrets Manager can be deployed at a time in a given account.
Deployment error: You will encounter an error in the Secrets Manager deployment step if a Trial instance is already deployed in the same account.
Re-deployment failure: If the automation provisions a Trial version of Secrets Manager, and the stack is undeployed and then redeployed with the Trial version in the same account, the "2b - Security Service - Secret Manager" deployment will fail. This is because an account can hold only one Trial version of Secrets Manager, and a deleted Trial instance still counts toward that limit until it is also removed from the "reclamation" state.

What are reclamations? In IBM Cloud, when you delete a resource, it doesn't immediately disappear. Instead, it enters a "reclamation" state, where it remains for a short period of time (usually 7 days) before being permanently deleted. During this time, you can still recover the resource if needed.

To resolve the re-deployment failure, delete the Secrets Manager instance from the reclamation state by running the following commands:

# List all resources in the reclamation state and note the reclamation ID of the Secrets Manager instance
ibmcloud resource reclamations
# Permanently delete the reclaimed instance
ibmcloud resource reclamation-delete <reclamation-id>

Customization options

There are numerous customization possibilities available out of the box. This section explores some common scenarios, but is not exhaustive.

Editing Individual Configurations

Each configuration in the deployed stack surfaces a large number of input parameters. You can directly edit each parameter to tailor your deployment by selecting the Edit option in the menu for the corresponding configuration on the right-hand side.

[Screenshot: edit config]

This approach enables you to:

Fine-tune account settings
Deploy additional Watson components, such as Watsonx Governance
Deploy to an existing resource group
Reuse existing Key Protect keys
Tune the parameters of the provisioned Code Engine project
...

Removing Configurations from the Stack

You can remove any configuration from the stack, provided there is no direct dependency in later configurations, by selecting the Remove from Stack option in the right-hand side menu for the corresponding configuration.

This applies to the following configurations:

Observability
Security and Compliance Center

[Screenshot: edit config]

Managing Stack-Level Inputs and Outputs

You can add or remove inputs and outputs surfaced at the stack level by following these steps:

Select the stack configuration
You are presented with a screen where you can promote any of the configuration inputs or outputs to the stack level.

Sharing Modified Stacks through a Private IBM Cloud Catalog

Once you have made modifications to your stack in Project, you can share it with others through a private IBM Cloud Catalog. To do so, follow these steps:

Deploy the stack at least once: You need to deploy the stack first to allow importing the stack definition to a private catalog.
Select the "Add to private catalog" option in the menu located on the stack configuration.

This will allow you to share your modified stack with others through a private IBM Cloud Catalog.

Customizing for Your Application

As you deploy your own application, you may want to remove the last configuration (Sample RAG app configuration), which is specific to the sample app provided out of the box. You can use the code of this sample automation as a guide to implement your own, depending on your application needs. The code is available at https://github.com/terraform-ibm-modules/terraform-ibm-rag-sample-da.

Undeploying and Deleting the Stack and All Associated Infrastructure Resources

Clean Up the Configuration

This step is optional if you are planning to fully destroy all Watson resources. The artifacts created by the application will be deleted as part of undeploying the Watson resources.

Follow the steps outlined in the cleanup.md file to remove the configuration specific to the sample app.

Undeploying Infrastructure

To undeploy the infrastructure created by the automation, complete the following steps:

1. Delete Resources Created by the CI toolchain

Those resources are not destroyed automatically as part of undeploying the stack in Project:

Code Engine Project: Delete the Code Engine project created for the sample application.
Container Registry Namespace: Delete the Container Registry namespace created by the CI toolchain.
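The two deletions above can also be done from the IBM Cloud CLI. The following is a minimal sketch, assuming that the Code Engine and Container Registry plug-ins are installed, that you are logged in to the target account, and that the correct region is targeted for both services. The helper names and the DRY_RUN switch are hypothetical conveniences introduced here. By default the sketch only prints the commands it would run, because both deletions are destructive. Look up the real project and namespace names first with ibmcloud ce project list and ibmcloud cr namespace-list.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the DRY_RUN switch are hypothetical.
# Pass the real Code Engine project and Container Registry namespace names.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Delete the Code Engine project and the Container Registry namespace
# that were created by the CI toolchain.
cleanup_ci_resources() {
  local ce_project="$1" cr_namespace="$2"
  run ibmcloud ce project delete --name "$ce_project"
  run ibmcloud cr namespace-rm "$cr_namespace"
}
```

Review the printed commands first, then set DRY_RUN=0 and call cleanup_ci_resources again with the same arguments to perform the deletion.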

2. Undeploy Configurations in the Project

Select the "Undeploy" option in the menu associated with the stack in the project.

3. Delete Project

Once all configurations are undeployed, you may delete the project.
