Note: This project is for demonstration only and is not an officially supported Google product. If you're a Googler using this demo, please fill out this form. If you're interested in using our hosted version, please fill out this form.
This project demonstrates production-quality practices for using techniques like Retrieval Augmented Generation (RAG) and ReAct to extend your Gen AI application with information from Cloud Databases.
This demo showcases the Cymbal Air customer service assistant. Cymbal Air is a fictional passenger airline. The assistant is an AI chatbot that helps travellers manage flights and look up information about Cymbal Air's hub at San Francisco International Airport (SFO).
It can help answer users' questions about flights, bookings, and amenities at SFO.
One of the best tools for reducing hallucinations is Retrieval Augmented Generation (RAG). RAG is the technique of retrieving data relevant to a request, augmenting the prompt to the LLM with it, and letting the model generate a response based on the data included in the prompt. Grounding the model's response this way makes it less likely to hallucinate. RAG is also useful for giving the LLM access to data it didn't have when it was trained. And unlike fine-tuning, the information retrieved for RAG does not alter the model or otherwise leave the context of the request, making it more suitable for use cases where information privacy and security are important.
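As a rough illustration, a RAG flow follows a retrieve-augment-generate pattern like the sketch below. The `search_amenities` helper and `llm` client are hypothetical placeholders, not this repo's actual API:

```python
# Minimal RAG sketch: retrieve relevant records, augment the prompt with
# them, and let the model generate an answer grounded in that data.
# `search_amenities` and `llm` are hypothetical placeholders.
def answer_with_rag(question: str) -> str:
    # 1. Retrieve: fetch records relevant to the question,
    #    e.g. by calling the retrieval service.
    docs = search_amenities(query=question, top_k=5)

    # 2. Augment: include the retrieved data in the prompt.
    context = "\n".join(doc["description"] for doc in docs)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

    # 3. Generate: the response is grounded in the provided context.
    return llm.generate(prompt)
```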
Cloud databases provide a managed solution for storing and accessing data in a scalable and reliable way. By connecting an LLM to a cloud database, developers can give their applications access to a wider range of information and reduce the risk of hallucinations.
Another increasingly popular technique for LLMs is ReAct prompting. ReAct (a combination of “Reason” and “Act”) asks the LLM to work through a problem with explicit verbal reasoning. It establishes a framework for the model (acting as an Agent) to “think aloud” using a specific template, with steps like “Thought”, “Action”, and “Observation”.
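For illustration, a single ReAct-style turn might look like the trace below (the tool name and data are invented for the example):

```
Question: Are there any direct flights from SFO to Seattle tomorrow?
Thought: I need flight data, so I should call the flight search tool.
Action: search_flights[departure_airport=SFO, arrival_airport=SEA]
Observation: [{"flight_number": "CY 123", "departure_time": "08:15"}]
Thought: I now have the information needed to answer the user.
Answer: Yes, Cymbal Air flight CY 123 departs SFO at 8:15 AM.
```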
Many platforms support similar patterns to help extend your LLM’s capabilities – Vertex AI has Extensions, LangChain has Tools, and ChatGPT has plugins. We can leverage this pattern to help an LLM understand what information it can access and decide when it needs to access it.
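For instance, the retrieval service could be exposed to a LangChain agent as a Tool along these lines (the endpoint path and base URL are assumptions for illustration, not this repo's exact API):

```python
import requests
from langchain_core.tools import tool

# Assumed base URL for a locally running retrieval service.
RETRIEVAL_SERVICE_URL = "http://127.0.0.1:8080"

@tool
def search_airports(query: str) -> str:
    """Search for airports matching the user's query."""
    # Hypothetical endpoint; see retrieval_service/app/routes.py
    # for the real routes.
    response = requests.get(
        f"{RETRIEVAL_SERVICE_URL}/airports/search",
        params={"query": query},
        timeout=10,
    )
    response.raise_for_status()
    return response.text
```

The agent reads the tool's name and docstring to decide when to call it, which is how the LLM "knows" what information it can access.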
This demo contains 3 key parts:

1. **Application** - an LLM-based chat application that acts as the user-facing assistant (the `llm_demo` folder)
2. **Retrieval service** - a middleware service that queries the database on the application's behalf (the `retrieval_service` folder)
3. **Database** - a cloud database containing the data the LLM can access
Running the retrieval service separately (as opposed to in the app itself) can help address a number of challenges: it keeps database credentials and query logic out of the application code, lets the service scale independently of the app, and makes the same endpoints reusable across multiple applications.
Deploying this demo consists of 3 steps:

1. Setting up your database
2. Deploying the retrieval service
3. Running the LLM-based application

Before you begin, clone this repo to your local machine:
```bash
git clone https://github.com/GoogleCloudPlatform/genai-databases-retrieval-app.git
```
The retrieval service uses an interchangeable 'datastore' interface. Set up and initialize one of the databases listed below:
- Instructions for deploying the retrieval service
- Instructions for running the app locally
- Instructions for cleaning up resources
This demo can also serve as a starting point for writing your own retrieval service. The directory is organized into the following folders:
Directory | Description |
---|---|
`data` | Contains CSV files with the dataset for a working demo. |
`llm_demo` | Contains an LLM-based application that uses the retrieval service via multiple orchestrators (e.g. LangChain, VertexAI). |
`retrieval_service` | Contains the service for extending an LLM with information from the database. |
You can copy or fork the `retrieval_service` folder to customize it to your needs. There are two main places to start:

- `retrieval_service/app/routes.py` - contains the API endpoints that the LLM will call
- `retrieval_service/datastore/datastore.py` - contains the interface used to abstract the database. There are specific implementations of this in the `providers` folder that can be customized with logic for your specific schema (see the sketch below).
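As a rough sketch, a custom provider implements the datastore interface with queries for your own schema. The class and method names below are illustrative; the real interface lives in `retrieval_service/datastore/datastore.py`:

```python
from abc import ABC, abstractmethod

class Client(ABC):
    """Illustrative stand-in for the abstract datastore interface."""

    @abstractmethod
    async def search_airports(self, query: str) -> list[dict]:
        """Return airports matching the query."""

class MyDatabaseClient(Client):
    """Example provider customized for a specific schema."""

    async def search_airports(self, query: str) -> list[dict]:
        # Replace with a real query against your database, e.g.:
        # SELECT iata, name FROM airports WHERE name ILIKE '%' || $1 || '%'
        return [{"iata": "SFO", "name": "San Francisco International Airport"}]
```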