The following deployable architecture automates the deployment of a sample gen AI Pattern on IBM Cloud, including all underlying IBM Cloud and watsonx infrastructure. This architecture implements the best practices for watsonx gen AI Pattern deployment on IBM Cloud, as described in the reference architecture.
This deployable architecture provides a comprehensive foundation for trust, observability, security, and regulatory compliance. The architecture configures an IBM Cloud account to align with compliance settings. It deploys key management and secrets management services and the infrastructure to support continuous integration (CI), continuous delivery (CD), and continuous compliance (CC) pipelines for secure management of the application lifecycle. It also deploys the watsonx services suite and IBM Cloud Elasticsearch to facilitate a retrieval augmented generation (RAG) pattern. The pipelines facilitate the deployment of the application, check for vulnerabilities and auditability, and help ensure a secure and trustworthy deployment of generative AI applications on IBM Cloud.
Two variations are available for this deployable architecture:
Basic variation:
Standard variation:
This deployable architecture is designed to showcase a fully automated deployment of a retrieval augmented generation application through IBM Cloud Projects. It provides a flexible and customizable foundation for your own watsonx applications on IBM Cloud. This architecture deploys the following sample application by default.
By using this architecture, you can accelerate your deployment and tailor it to meet your business needs and enterprise goals.
This architecture can help you achieve the following goals:
Before you deploy the deployable architecture, make sure that you complete the following actions:
[!IMPORTANT] You must use an API key that is associated with a user. You can't use service ID keys or trusted profiles.
Create an API key in the target account with the required permissions. The target account is the account that hosts the resources that are deployed by this architecture. For more information, see Managing user API keys.
In test or evaluation environments, you can grant the Administrator role on the following services
`User API key creator` role, as it is mandatory for a successful OpenShift cluster deployment.
To scope access to be more restrictive for a production environment, refer to the minimum permission level in the permission tab of this deployable architecture.
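The API key prerequisite above can also be met from the command line. The following is a minimal sketch, assuming the IBM Cloud CLI is installed and that you are already logged in to the target account as a user (not a service ID). The function name, the default key name, and the output file path are hypothetical placeholders chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: create a user API key in the account you are logged in to.
# Assumes `ibmcloud login` was already run against the target account.

create_user_api_key() {
  key_name="${1:-rag-da-deploy-key}"        # hypothetical default name
  key_file="${2:-./rag-da-api-key.json}"    # hypothetical output path
  # Writes the key details to a local file instead of printing the value.
  ibmcloud iam api-key-create "$key_name" \
    -d "API key for deploying the RAG deployable architecture" \
    --file "$key_file"
}

# Example:
#   create_user_api_key my-deploy-key ./my-deploy-key.json
```

Store the resulting file securely. It contains the API key value that you later add in the Security panel of the project configuration.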
`gpg --gen-key` without a passphrase (if not expired, you can use a previously generated key).
`gpg --export-secret-key <email address> | base64`. For more information about storing the key, see Generating a GPG key.
Select Create new and enter the following details:
Select a region and resource group for the project. For example, for evaluation purposes, you can select the region that is closest to you and the default resource group.
For more information about the enterprise account structures, see the Central administration account white paper.
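The signing key from the prerequisites is supplied later as the `signing_key` variable. As a minimal sketch of the export step described in the prerequisites, the following shell function wraps the `gpg --export-secret-key <email address> | base64` pipeline. The function name is hypothetical, and stripping newlines from the base64 output is an assumption made so that the value can be pasted as a single line; it is not stated in the original instructions.

```shell
#!/usr/bin/env bash
# Sketch only: export an existing GPG secret key as a single-line base64 string.
# Assumes the key was already generated with `gpg --gen-key` (without a passphrase).

export_signing_key() {
  email="${1:-}"
  if [ -z "$email" ]; then
    echo "usage: export_signing_key <email address>" >&2
    return 2
  fi
  # Same pipeline as in the prerequisites. `tr -d '\n'` removes the line
  # wrapping that some base64 implementations add (an assumption, for easier pasting).
  gpg --export-secret-key "$email" | base64 | tr -d '\n'
}

# Example (hypothetical address):
#   export_signing_key jane.doe@example.com > signing_key.b64
```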
You can now create your configuration by setting variables.
From the Security panel, select the authentication method that you want to use to deploy your architecture.
Add the API key from the prerequisites in Before you begin.
Enter values for required fields from the Required tab.
Review values for optional fields from the Optional tab:
`signing_key` variable from the prerequisites in Before you begin.
You can deploy a stacked deployable architecture through the IBM Cloud console in two ways:
By using Auto-deploy: This deployment method can be useful for demonstration and nonproduction environments. With auto-deploy, all the stack member configurations are validated and then approved and deployed.
You can check the Auto-deploy setting for your project by clicking Manage > Settings. By turning on Auto-deploy, you enable the setting for all configurations in the project.
[!TIP] After you approve the configuration, you might receive the error message "Unable to validate your configuration". To resolve the issue, refresh your browser.
You might see "New version available" notifications in the Needs Attention column in your project configuration. You can ignore these messages because they do not prevent you from deploying the stack.
Click the Options icon next to View stack configurations and click Validate.
If the Auto-deploy setting is off in your project, only member configurations that are ready are validated.
In your project, click the Configurations tab.
If the first member configuration of the stack (`Account Infrastructure Base`) is not marked as Ready to validate, refresh the page in your browser.
`Account Infrastructure Base` row.
The Retrieval Augmented Generation Pattern deployable architecture is now deployed in the target account.
After the architecture is deployed, the sample application starts in the newly provisioned DevOps service.
To monitor the build and deployment of the application, follow these steps:
`resource_group_name` inputs of the deployable architecture.
`Workload - Sample RAG App Configuration` row.
`Outputs` tab, the URL to the deployed application is listed under the `sample_app_public_url` output.
To minimize costs, the automation deploys a Trial pricing plan of Secrets Manager. You can create only one Trial instance of Secrets Manager. You can deploy a Standard plan instance of Secrets Manager from the Optional settings of the stack.
If the deployment fails because a Trial instance of Secrets Manager already exists in the account, delete the existing Trial instance. After deletion, also delete the service from the reclamation state.
In IBM Cloud, when you delete a resource, it doesn't immediately disappear. Instead, it enters a reclamation state, where it remains for a short time (usually 7 days) before being permanently deleted. During the reclamation state, you can recover the resource, if needed.
Run the following IBM Cloud CLI commands to delete the service from the reclamation state.
The first command lists all the resources in the reclamation state.
# List all the resources in reclamation state with its reclamation ID
ibmcloud resource reclamations
Find the reclamation ID of the Secrets Manager service. Use that ID in the following command.
ibmcloud resource reclamation-delete <reclamation-id>
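The two commands above can be combined into a small helper. This is a sketch under the assumption that the IBM Cloud CLI is installed and logged in to the target account. The function name `reclaim_cleanup` is hypothetical, and the reclamation ID must still be looked up by hand from the list output.

```shell
#!/usr/bin/env bash
# Sketch only: list reclamations, or permanently delete one by ID.

reclaim_cleanup() {
  reclamation_id="${1:-}"
  if [ -z "$reclamation_id" ]; then
    # No ID given: list the resources in reclamation so the ID can be looked up.
    ibmcloud resource reclamations
    return 0
  fi
  # ID given: delete the resource from the reclamation state.
  # After this step the resource can no longer be recovered.
  ibmcloud resource reclamation-delete "$reclamation_id"
}

# Example:
#   reclaim_cleanup                    # find the Secrets Manager reclamation ID
#   reclaim_cleanup <reclamation-id>   # then delete it
```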
Many customizations are possible with this architecture. These are some common options.
Each member configuration includes a large number of input parameters. You can edit the configuration to change the default values.
For example, by editing the member configuration, you can accomplish these things:
To edit the member configuration, select Edit from the Options icon in the member configuration row.
You can remove a member configuration from the stack if no other configuration depends on it.
You can remove the following configurations in this architecture:
To remove a member configuration, select Remove from Stack from the Options icon in the member configuration row.
You can add or remove input and output variables at the stack level by following these steps:
You can selectively provision observability resources such as Activity Tracker routes and targets, and Cloud Monitoring instances by following these steps:
`cloud_logs_provision`: Set this to provision or skip provisioning an IBM Cloud Logs instance.
`cloud_monitoring_provision`: Set this to provision or skip provisioning an IBM Cloud Monitoring instance.
`enable_at_event_routing_to_cos_bucket`: Set this to enable or disable event routing from Activity Tracker to the Object Storage bucket.
`enable_at_event_routing_to_cloud_logs`: Set this to enable or disable event routing from Activity Tracker to Cloud Logs.
After you modify your deployable architecture in projects, you can share it with others through a private IBM Cloud catalog. To share your deployable architecture, follow the steps in Sharing your deployable architecture to your enterprise.
You can use the code of this sample automation as a guide to customize the sample app to meet your requirements. The code is available at https://github.com/terraform-ibm-modules/terraform-ibm-rag-sample-da.
To use your own app, remove the `Workload - Sample RAG App Configuration` member configuration from the stack. This member configuration is specific to the default sample app.
Clean up the configuration
This step is optional if you plan to destroy all Watson resources. The artifacts that are created by the application are deleted as part of undeploying the Watson resources.
Follow the steps outlined in the cleanup.md file to remove the configuration for the sample app.
Delete resources created by the CI toolchain
The following resources, which are created by the toolchain, are not destroyed when you undeploy the stack in the project.
Delete the project.
To undeploy the infrastructure created by the deployable architecture, follow the steps in Deleting a project in the IBM Cloud docs.