looker-open-source / dashboard-summarization

MIT License

Looker Dashboard Summarization

This is an extension (plugin) for Looker that integrates LLMs hosted on Vertex AI into a streaming dashboard summarization experience powered by WebSockets.

Description

The Dashboard Summarization extension can be broken down into 3 parts:

  1. Summarization
    • Generates concise summaries on your dashboard's data
  2. Prescription
    • Grounded in your dashboard's data, it can prescribe operational actions and point out outliers
  3. Action
    • Leveraging Looker's API, insights can be exported into the business tools your organization uses

Additionally, the extension provides:

Upcoming capabilities on the roadmap:

Technologies Used

Frontend

Looker

Backend API

Export APIs

Setup

1. Generative AI & Websocket Server

This section describes how to set up the web server on Cloud Run that powers the Generative AI and WebSocket integrations.

Getting Started for Local Development

  1. Clone or download a copy of this repository to your development machine.

    # cd ~/ Optional. your user directory is usually a good place to git clone to.
    git clone https://github.com/looker-open-source/dashboard-summarization.git
  2. Navigate (cd) to the template directory on your system

    cd dashboard-summarization/websocket-service/src
  3. Install the dependencies with NPM.

    npm install

    You may need to update your Node version or use a Node version manager to change your Node version.

  4. Rename looker-example.ini to looker.ini and replace the placeholder values with your Admin API credentials. IMPORTANT: use a section header that matches the host of your Looker instance. Example below:

Ex: Looker instance -> https://mycompany.cloud.looker.com

   [mycompany]
   base_url=<Your Looker instance URL>
   client_id=<From your looker user's api credentials>
   client_secret=<From your looker user's api credentials>
   verify_ssl=true

This is configured to support deployment to multiple Looker instances reusing the same backend.

  5. Start the development server

    npm run start

    Your development server should be running at http://localhost:5000
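
The section-header rule from step 4 can be sketched in code. This is a minimal illustration, not code from the repository: it assumes the header is the first label of the instance host name, as in the mycompany example above, and the helper names sectionNameFromUrl and renderIniSection are hypothetical. How the backend actually looks up the section is not shown in this README, so check your looker.ini against the service's behaviour.

```typescript
// Hypothetical helpers (not part of the repository).
// Assumption: the section header is the first label of the Looker host name,
// e.g. https://mycompany.cloud.looker.com -> [mycompany].

function sectionNameFromUrl(instanceUrl: string): string {
  // Drop the scheme, then cut at the first "/", ":", "?" or "#" to isolate the host.
  const withoutScheme = instanceUrl.trim().replace(/^[a-z][a-z0-9+.-]*:\/\//i, "");
  const host = withoutScheme.split(/[\/:?#]/)[0];
  if (host.length === 0) {
    throw new Error(`No host found in "${instanceUrl}"`);
  }
  return host.split(".")[0].toLowerCase();
}

// Render one looker.ini section using the keys shown in the example above.
function renderIniSection(instanceUrl: string, clientId: string, clientSecret: string): string {
  return [
    `[${sectionNameFromUrl(instanceUrl)}]`,
    `base_url=${instanceUrl}`,
    `client_id=${clientId}`,
    `client_secret=${clientSecret}`,
    "verify_ssl=true",
  ].join("\n");
}
```

Calling renderIniSection once per Looker instance and concatenating the results gives one looker.ini that serves several instances from the same backend.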

Deployment

  1. For deployment you will need to build the Docker image and submit it to the Artifact Registry. First create a repository: replace REGION with your deployment region, then run this command from the repository root

    gcloud artifacts repositories create dashboard-summarization-docker-repo  --repository-format=docker  --location=REGION
  2. Navigate to template directory

    cd dashboard-summarization/websocket-service/src
  3. Rename looker-example.ini to looker.ini and replace the placeholder values with your Admin API credentials. IMPORTANT: use a section header that matches the host of your Looker instance. Example below:

Ex: Looker instance -> https://mycompany.cloud.looker.com

   [mycompany]
   base_url=<Your Looker instance URL>
   client_id=<From your looker user's api credentials>
   client_secret=<From your looker user's api credentials>
   verify_ssl=true

This is configured to support deployment to multiple Looker instances reusing the same backend.

  4. Update cloudbuild.yaml

    <YOUR_REGION> = Your deployment region
    <YOUR_PROJECT_ID> = Your GCP project ID
  5. Build the Docker image and submit it to the Artifact Registry, replacing the REGION variable with your deployment region. Skip this step if you already have a deployed image. Please see the official docs for creating the yaml file.

    gcloud auth login && gcloud auth application-default login && gcloud builds submit --region=REGION --config cloudbuild.yaml

    Save the returned Docker image URL. You can also get the Docker image URL from the Artifact Registry.

  6. Navigate (cd) to the terraform directory on your system

    cd .. && cd terraform
  7. Replace the defaults in the variables.tf file for project, region, docker url and service name.

    project_id=<GCP project ID>
    deployment_region=<Your deployment region>
    docker_image=<The docker image url from step 5>
  8. Deploy resources. Ensure Application Default Credentials for GCP are exported in your environment first.

    terraform init
    
    terraform plan
    
    terraform apply
  9. Save the deployed Cloud Run URL endpoint

Optional: Setup Log Sink to BQ for LLM Cost Estimation and Request Logging

This extension makes one call to Vertex AI for each query in the dashboard and one final call to format all the summaries. Each request is logged with its billable characters, which can be used to estimate and monitor costs. Please see Google Cloud's docs on setting up a log sink to BigQuery (BQ), using the filter below for Dashboard Summarization logs (change the location and service name if those variables have been updated):

resource.type = "cloud_run_revision"
resource.labels.service_name = "websocket-service"
resource.labels.location = "us-central1"
severity>=DEFAULT
jsonPayload.component="dashboard-summarization-logs"
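
Once the logs are exported, the cost arithmetic is straightforward. The sketch below is illustrative only: the field name billableCharacters and the price per 1,000 characters are assumptions, so check the actual jsonPayload fields in your logs and the current Vertex AI pricing for your model before relying on the numbers.

```typescript
// Hypothetical row shape: the real field name in jsonPayload may differ.
interface SummarizationLogRow {
  billableCharacters: number;
}

// One Vertex AI call per dashboard query, plus one final call that formats the summaries.
function vertexCallsPerRun(queryCount: number): number {
  if (queryCount < 0 || Math.floor(queryCount) !== queryCount) {
    throw new Error("queryCount must be a non-negative integer");
  }
  return queryCount + 1;
}

// Estimated cost = total billable characters / 1000 * price per 1,000 characters.
// pricePerThousandChars is a placeholder; look up the rate for your model.
function estimateCost(rows: SummarizationLogRow[], pricePerThousandChars: number): number {
  let totalCharacters = 0;
  for (const row of rows) {
    totalCharacters += row.billableCharacters;
  }
  return (totalCharacters / 1000) * pricePerThousandChars;
}
```

For a dashboard with 5 queries, vertexCallsPerRun(5) gives 6 logged requests per summarization run.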

2. Looker Extension Framework Setup

Getting Started for Local Development

  1. Clone or download a copy of this repository to your development machine (if you haven't already).

    # cd ~/ Optional. your user directory is usually a good place to git clone to.
    git clone https://github.com/looker-open-source/dashboard-summarization.git
  2. Navigate (cd) to the root directory in the cloned repo

  3. Ensure all the appropriate environment variables are set. Copy the .env.example file and save it as .env. See the Export Integration steps below for the Slack and Google Chat variables. All of these are optional except WEBSOCKET_SERVICE.

    SLACK_CLIENT_ID=
    SLACK_CLIENT_SECRET=
    CHANNEL_ID=
    SPACE_ID=
    WEBSOCKET_SERVICE=<Required: Cloud run endpoint url>
  4. Install the dependencies with NPM.

    npm install

    You may need to update your Node version or use a Node version manager to change your Node version. If you get errors installing dependencies, you may try

    npm install --legacy-peer-deps
  5. Start the development server

    npm run develop

    Great! Your extension is now running and serving the JavaScript at http://localhost:8080/bundle.js.

  6. Now log in to Looker and create a new project.

    This is found under Develop => Manage LookML Projects => New LookML Project.

    You'll want to select "Blank Project" as your "Starting Point". You'll now have a new project with no files.

    1. In your copy of the extension project you have a manifest.lkml file.

    You can either drag & upload this file into your Looker project, or create a manifest.lkml with the same content. Change the id, label, or url as needed.

    project_name: "dashboard-summarization-extension"

    application: dashboard-summarization {
      label: "Dashboard Insights Powered by Vertex AI"
      # file: "bundle.js"
      url: "http://localhost:8080/bundle.js"
      mount_points: {
        dashboard_vis: yes
        dashboard_tile: yes
        standalone: yes
      }
      entitlements: {
        local_storage: yes
        use_form_submit: yes
        core_api_methods: ["run_inline_query","all_lookml_models","dashboard","dashboard_dashboard_elements"]
        external_api_urls: [
          "YOUR CLOUD RUN URL",
          "http://localhost:5000",
          "http://localhost:3000",
          "https://*.googleapis.com",
          "https://slack.com/api/*",
          "https://slack.com/*"
        ]
        oauth2_urls: [
          "https://accounts.google.com/o/oauth2/v2/auth",
          "https://www.googleapis.com/auth/chat.spaces",
          "https://www.googleapis.com/auth/drive.metadata.readonly",
          "https://www.googleapis.com/auth/spreadsheets.readonly",
          "https://www.googleapis.com/auth/userinfo.profile",
          "https://www.googleapis.com/auth/chat.spaces.readonly",
          "https://www.googleapis.com/auth/chat.bot",
          "https://www.googleapis.com/auth/chat.messages",
          "https://www.googleapis.com/auth/chat.messages.create",
          "https://slack.com/oauth/v2/authorize"
        ]
      }
    }
  7. Create a model LookML file in your project. The name doesn't matter. The model and connection won't be used, and in the future this step may be eliminated.

    • Add a connection in this model. It can be any connection, it doesn't matter which.
    • Configure the model you created so that it has access to some connection.
  8. Connect your new project to Git. You can do this multiple ways:

    • Create a new repository on GitHub or a similar service, and follow the instructions to connect your project to Git
    • A simpler but less powerful approach is to set up git with the "Bare" repository option which does not require connecting to an external Git Service.
  9. Commit your changes and deploy them to production through the Project UI.

  10. Reload the page and click the Browse dropdown menu. You should see your extension in the list.

    • The extension will load the JavaScript from the url provided in the application definition. By default, this is http://localhost:8080/bundle.js. If you change the port your server runs on in the package.json, you will need to also update it in the manifest.lkml.
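
A missing WEBSOCKET_SERVICE value (step 3) is easy to overlook, so it can help to check the .env contents before starting the development server. The following is a minimal sketch under the assumption that the file uses plain KEY=VALUE lines; parseEnv and missingRequired are hypothetical helpers, not part of the repository, and the parser ignores quoting and multi-line values.

```typescript
// Per the setup steps, WEBSOCKET_SERVICE is the only mandatory variable.
const REQUIRED_KEYS = ["WEBSOCKET_SERVICE"];

// Parse simple KEY=VALUE lines, skipping blank lines and "#" comments.
function parseEnv(text: string): Record<string, string> {
  const env: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // not a KEY=VALUE line
    env[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return env;
}

// Return the required keys that are absent or empty.
function missingRequired(env: Record<string, string>): string[] {
  return REQUIRED_KEYS.filter((key) => !env[key]);
}
```

An empty result from missingRequired means the required variable is present; the optional Slack and Google Chat variables are not checked.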

Deployment

The process above requires your local development server to be running to load the extension code. To let other people use the extension, you need to create a production build. As the kitchensink uses code splitting to reduce the size of the initially loaded bundle, the build generates multiple JavaScript files.

  1. In your extension project directory on your development machine, build the extension by running the command npm run build.
  2. Drag and drop the generated JavaScript file (bundle.js) contained in the dist directory into the Looker project interface.
  3. Modify your manifest.lkml to use file instead of url and point it at the bundle.js file.

Note that the additional JavaScript files generated during the production build process do not have to be mentioned in the manifest. These files will be loaded dynamically by the extension as and when they are needed. Note that to utilize code splitting, the Looker server must be at version 7.21 or above.

3. [Optional] Export Integration Setup

Slack OAuth Setup

  1. Follow the official Slack developer docs to set up an OAuth Application
  2. Acquire a SLACK_CLIENT_ID and SLACK_CLIENT_SECRET from the OAuth app created in Step 1 and add them to the .env file.
  3. Attach the appropriate User & Bot Scopes (recommended to at least have channels:read and channels:write)
  4. [Optional] If making Bot requests, add the bot to the channels you want it to access.

Note that the Slack integration hardcodes a specific channel id in the code. This can be modified, or an additional API request can be made to provide a channel selector experience.
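
One way to approach the channel selector is to call Slack's conversations.list Web API method and map the result to selector options. The sketch below covers only the mapping step and is not code from the repository: the interfaces model a small subset of the response (ok, error, channels with id, name and is_archived), and ChannelOption and toChannelOptions are hypothetical names.

```typescript
// Subset of the Slack conversations.list response used here.
interface SlackChannel {
  id: string;
  name: string;
  is_archived: boolean;
}

interface ConversationsListResponse {
  ok: boolean;
  error?: string;
  channels?: SlackChannel[];
}

// Hypothetical option shape for a dropdown in the extension UI.
interface ChannelOption {
  value: string; // channel id to use instead of the hardcoded one
  label: string; // display name, e.g. "#alerts"
}

// Turn an API response into sorted dropdown options, skipping archived channels.
function toChannelOptions(response: ConversationsListResponse): ChannelOption[] {
  if (!response.ok) {
    throw new Error(`Slack API error: ${response.error ?? "unknown"}`);
  }
  return (response.channels ?? [])
    .filter((channel) => !channel.is_archived)
    .map((channel) => ({ value: channel.id, label: `#${channel.name}` }))
    .sort((a, b) => a.label.localeCompare(b.label));
}
```

The selected option's value would then replace the hardcoded channel id when posting the summary. Large workspaces paginate this endpoint, which the sketch does not handle.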

Google Chat OAuth Setup

  1. Follow the official Google Chat developer docs to set up an OAuth Application
  2. Acquire a GOOGLE_CLIENT_ID from the OAuth app created in Step 1 and add it to the .env file.
  3. Configure a Google Chat Bot to send messages (this bot is only used for message ownership and not used to call the Google Chat API)
  4. Add the bot to specific Google Chat Spaces.

Note that the Google Chat integration hardcodes a specific space id in the code. This can be modified, or an additional API request can be made to provide a space selector experience.


Recommendations for fine tuning the model

This app uses a one-shot prompt technique for fine tuning the LLM, meaning that all the metadata for the dashboard and request is contained in the prompt. To improve the accuracy, detail, and depth of the summaries and prescriptive steps returned by the LLM, pass as much context as possible about the dashboard and the general recommendation themes in the prompt sent to the model. This can all be done through Looker rather than hard-coded in the Cloud Run service. Details below: