The following deployable architecture automates the deployment of a sample GenAI Pattern on IBM Cloud, including all underlying IBM Cloud infrastructure. This architecture implements the best practices for Watsonx GenAI Pattern deployment on IBM Cloud, as described in the reference architecture.
This deployable architecture provides a comprehensive foundation for trust, observability, security, and regulatory compliance by configuring the IBM Cloud account to align with compliance settings, deploying key and secret management services, and deploying the infrastructure to support CI/CD/CC pipelines for secure application lifecycle management. These pipelines facilitate the deployment of the application, vulnerability checks, and auditability, ensuring a secure and trustworthy deployment of Generative AI applications on IBM Cloud.
This deployable architecture is designed to showcase a fully automated deployment of a retrieval augmented generation application through IBM Cloud Project, providing a flexible and customizable foundation for your own Watson-based application deployments on IBM Cloud. This architecture deploys the following sample application by default.
By leveraging this architecture, you can accelerate your deployment and tailor it to meet your unique business needs and enterprise goals.
By using this architecture, you can:
To deploy this architecture, follow these steps.
Before deploying the deployable architecture, ensure you have:
The Administrator role on IAM Identity Service, All Identity and Access enabled services, and All Account Management services. If you need to narrow access further, for a production environment for instance, the minimum level of permissions is indicated in the Permission tab of the deployable architecture.
A signing key, generated with gpg --gen-key without a passphrase (if not generated before or expired) and then exported with the gpg --export-secret-key <Email Address> | base64 command. See the devsecops image signing page for details. Keep note of the key for later. The signing key is not required to deploy the Cloud resources created by this deployable architecture, but it is necessary for the automation to build and deploy the sample application.
The IBM Cloud CLI project plugin, installed with the ibmcloud plugin install project command. More information is available here.
Ensure that you are familiar with the "Important Deployment Considerations" located at the bottom of this document.
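The signing key export described in the prerequisites can be sketched as a small shell helper. This is a minimal sketch rather than the documented procedure: export_signing_key is a hypothetical name, and stripping newlines with tr is an assumption made so that the value can be pasted as a single line (some base64 implementations wrap their output).

```shell
# Sketch: export an existing GPG secret key as a single-line base64 string.
# Assumes the key was already created with `gpg --gen-key` (no passphrase).

# export_signing_key is a hypothetical helper name, not part of any tool.
export_signing_key() {
  key_email="$1"
  # Same pipeline as in the prerequisites, plus newline stripping (assumption).
  gpg --export-secret-key "$key_email" | base64 | tr -d '\n'
}

# Usage (the email address is a placeholder):
#   export_signing_key "you@example.com" > signing_key.b64
```

Keep the resulting value private, since it contains the secret key material.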
Locate the tile for the Deployable Architecture in the IBM Cloud Catalog.
Click the "Add to project" button.
Select Create new and enter the following details:
Configuration Name (name of the automation in the project, e.g., "RAG", "dev" or "prod", ideally matching the deployment target, but this can be any name)
Click the Add button (or Create if this is the first project in the account) at the bottom right of the modal popup to complete.
After completing Step 2 - Deploy the Stack in a New Project from Catalog, you are directed to a page where you can enter the configuration for your deployment:
Enter the API key in the api_key field.
You may explore the other available inputs, such as the region and resource group name (under the Optional tab), and either leave them as is or modify them as needed.
Once ready, click the "Save" button at the top of the screen.
Navigate to the project deployment view by clicking the project name in the breadcrumb menu.
You should be directed to a screen looking like:
Note: In rare cases, the first member of the stack may not be marked as "Ready to validate". Refreshing the page in your browser should solve this problem.
Two approaches to deploy the architecture:
To enable auto-deployment:
The project will then validate, approve, and deploy each stack member, taking approximately one hour to complete.
Click on validate
Wait for validation
Approve and click the deploy button
Wait for deployment
Repeat step 1 for the next configuration in the architecture. Note that once the initial base configuration is deployed, you will be given the option to validate and deploy multiple configurations in parallel.
At this point, the infrastructure has been successfully deployed in the target account, and the initial build of the sample application has started in the newly-provisioned DevOps service.
To monitor the build and deployment of the application, follow these steps:
Once the initial run of the CI pipeline completes, you should be able to view the application running in the created Code Engine project.
After the application has been built and is running in Code Engine, there are additional steps specific to the sample app that need to be completed to fully enable Watson Assistant in the app. To complete the installation, follow the steps outlined in the application README.md file.
The deployable architecture can only be deployed with an API Key associated with a user. It is not compatible with API Keys associated with a serviceId. Additionally, it cannot be deployed using the Project trusted profile support.
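As an illustration of the user API key requirement, the following sketch creates a key for the currently logged-in user with the IBM Cloud CLI. It assumes you have already run ibmcloud login; create_user_api_key is a hypothetical helper, and the key name, description, and file name are placeholders.

```shell
# Sketch: create an API key tied to the logged-in user (not a service ID).
# Assumes a prior `ibmcloud login`; all names below are placeholders.

# create_user_api_key is a hypothetical helper name.
create_user_api_key() {
  key_name="$1"
  key_file="$2"
  # `ibmcloud iam api-key-create` creates a user API key. Keys created with
  # `ibmcloud iam service-api-key-create` belong to a service ID and are
  # not compatible with this deployable architecture.
  ibmcloud iam api-key-create "$key_name" \
    -d "API key for the RAG deployable architecture" \
    --file "$key_file"
}

# Usage:
#   create_user_api_key rag-da-key rag-da-key.json
```

The file written by --file contains the key value, so store it securely and delete it once the value is saved in the project configuration.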
You may see notifications in IBM Cloud Project indicating that one or more configurations in the stack have new versions available. You can safely ignore these messages at this point, as they will not prevent you from deploying the stack. No specific action is required from you.
Please note that these notifications are expected, as we are rapidly iterating on the development of the underlying components. As new stack versions become available, the versions of the underlying components will also be updated accordingly.
The automation is configured to deploy a Trial version of Secrets Manager by default to minimize costs. However, the Trial version has some limitations. To avoid these limitations, you can opt to deploy a standard (paid) instance of Secrets Manager under the Optional settings of the stack.
Here are the limitations of the Trial version:
What are reclamations? In IBM Cloud, when you delete a resource, it doesn't immediately disappear. Instead, it enters a "reclamation" state, where it remains for a short period of time (usually 7 days) before being permanently deleted. During this time, you can still recover the resource if needed.
To resolve the re-deployment failure, delete the Secrets Manager service from the reclamation state by running the following commands:
ibmcloud resource reclamations # lists all the resources in reclamation state, get the reclamation ID of the secret manager service
ibmcloud resource reclamation-delete <reclamation-id>
There are numerous customization possibilities available out of the box. This section explores some common scenarios, but is not exhaustive.
Each configuration in the deployed stack surfaces a large number of input parameters. You can directly edit each parameter to tailor your deployment by selecting the Edit option in the menu for the corresponding configuration on the right-hand side.
This approach enables you to:
You can remove any configuration from the stack, provided there is no direct dependency in later configurations, by selecting the Remove from Stack option in the right-hand side menu for the corresponding configuration.
This applies to the following configurations:
You can add or remove inputs and outputs surfaced at the stack level by following these steps:
Select the stack configuration
You are presented with a screen that lets you promote any of the configuration inputs or outputs to the stack level.
Once you have made modifications to your stack in Project, you can share it with others through a private IBM Cloud Catalog. To do so, follow these steps:
This will allow you to share your modified stack with others through a private IBM Cloud Catalog.
As you deploy your own application, you may want to remove the last configuration (Sample RAG app configuration), which is specific to the sample app provided out of the box. You can use the code of this sample automation as a guide to implement your own, depending on your application needs. The code is available at https://github.com/terraform-ibm-modules/terraform-ibm-rag-sample-da.
This step is optional if you are planning to fully destroy all Watson resources. The artifacts created by the application will be deleted as part of undeploying the Watson resources.
Follow the steps outlined in the cleanup.md file to remove the configuration specific to the sample app.
To undeploy the infrastructure created by the automation, complete the following steps:
The following resources are not destroyed automatically as part of undeploying the stack in Project:
Select the "Undeploy" option in the menu associated with the stack in the project.
Once all configurations are undeployed, you may delete the project.