amundsen-io / amundsen

Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
https://www.amundsen.io/amundsen/
Apache License 2.0

Feature Proposal: Tableau dashboard extractors for databuilder #565

Closed ccarterlandis closed 3 years ago

ccarterlandis commented 4 years ago

Expected Behavior or Use Case

With the recent addition of dashboard support for Amundsen, it's now possible to build extractors for Tableau dashboards and visualizations. Since Tableau is widely used as a data visualization and analysis tool, having the ability to index these Tableau dashboards inside Amundsen gives better context for how the data is actually being used and enables users to discover and share dashboards and visualizations that have already been built.

Service or Ingestion ETL

These extractors would be implemented in the amundsendatabuilder module. Currently, the extractors would not require changes to any other Amundsen module.

Possible Implementation

This proposal is currently a work in progress. You can track the progress here: https://github.com/lyft/amundsendatabuilder/pull/303

Overview

The extractors are built around Tableau workbooks being the Amundsen equivalent of a dashboard. The extractors utilize Tableau's Metadata API to query information about workbooks and their associated entities like projects (dashboard_groups), custom SQL queries (dashboard_query), and sheets/dashboards within the workbooks (dashboard_chart).
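As a rough illustration of the mapping described above, here is a minimal sketch and not the code from the PR. The query text, the response field names (`name`, `projectName`, `sheets`, `dashboards`) and the helper `workbook_to_records` are assumptions made for illustration and should be checked against the Metadata API schema. Only the Tableau-to-Amundsen correspondence itself comes from this proposal.

```python
from typing import Any, Dict, Iterator

# Correspondence stated in this proposal (Tableau entity -> Amundsen entity).
TABLEAU_TO_AMUNDSEN = {
    "workbook": "dashboard",
    "project": "dashboard_group",
    "custom SQL query": "dashboard_query",
    "sheet or dashboard": "dashboard_chart",
}

# Hypothetical GraphQL query; the field names are assumptions, not verified
# against a live Tableau Server.
WORKBOOKS_QUERY = """
query {
  workbooks {
    name
    projectName
    sheets { name }
    dashboards { name }
  }
}
"""


def workbook_to_records(workbook: Dict[str, Any]) -> Iterator[Dict[str, str]]:
    """Yield flat, Amundsen-style records for one workbook dict.

    The input is assumed to have the shape requested by WORKBOOKS_QUERY.
    """
    group = workbook["projectName"]   # Tableau project -> dashboard_group
    dashboard = workbook["name"]      # Tableau workbook -> dashboard
    yield {"type": "dashboard", "dashboard_group": group, "dashboard": dashboard}

    # Sheets and dashboards inside the workbook both become dashboard_chart.
    views = list(workbook.get("sheets", [])) + list(workbook.get("dashboards", []))
    for view in views:
        yield {
            "type": "dashboard_chart",
            "dashboard_group": group,
            "dashboard": dashboard,
            "chart": view["name"],
        }


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the output.
    sample = {
        "name": "Sales",
        "projectName": "Finance",
        "sheets": [{"name": "Revenue"}],
        "dashboards": [{"name": "Overview"}],
    }
    for record in workbook_to_records(sample):
        print(record)
```

Custom SQL queries are left out of the sketch because how they attach to a workbook in the API response is not described here.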

Relations between the Amundsen dashboard model and Tableau

Luckily, the Tableau Metadata API is queried through GraphQL, so retrieving the data and loading it into Amundsen's Neo4j graph model is relatively straightforward. However, there are a few notable differences in the conceptual models that need to be addressed:

Technical notes

Context

@alevene and I are building this integration on behalf of Gusto. For Gusto's use case, we are interested in exposing Tableau resources in Amundsen to make existing dashboards easier to discover, so that we can avoid duplicate dashboard development and provide background on the provenance/lineage of the dashboards.

dorianj commented 3 years ago

This is implemented, right? Please re-open if I missed something major; if we want to make enhancements, new smaller tickets would be good.

ccarterlandis commented 3 years ago

Yep, this is implemented - my internship at Gusto ended before the PR got merged, so I totally forgot about this issue. Sorry about that! I think you are right to close it 🚀