airbnb / chronon

Chronon is a data platform for serving AI/ML applications.
Apache License 2.0
673 stars 36 forks

Update analyzer for metadata extraction #752

Closed yuli-han closed 2 months ago

yuli-han commented 2 months ago

Summary

The metadata exporter calls Analyzer to analyze the data before generating the metadata. This analysis includes:

  1. Validate that the user running the metadata export has permission on the input source tables.
  2. Run a query on the input source tables to get the output schema.

This process works well for regular users. The metadata exporter, however, is run by the Chronon team, who may not have access to some of the source tables, so the task fails to generate metadata for the affected group_by/joins.

To solve this, we introduce a flag, validateTablePermission, in Analyzer. It defaults to true. The metadata exporter sets it to false, which skips both the table permission validation and the source table read, so the permission issue no longer blocks metadata generation.
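For readers without access to the internal context, here is a rough sketch of the intended behavior. It is not the actual Analyzer code: apart from the flag name validateTablePermission, every name below (SketchAnalyzer, TableAccess, AnalysisResult, RestrictedTables) is a hypothetical stand-in, and the real change may be structured differently.

```scala
// Hypothetical stand-ins for illustration only. None of these names come from
// the Chronon codebase except the flag name validateTablePermission.
final case class AnalysisResult(
    permissionChecked: Boolean,
    schema: Option[Seq[String]]
)

// The two operations that require access to the source table.
trait TableAccess {
  def hasReadPermission(table: String): Boolean
  def readSchema(table: String): Seq[String]
}

class SketchAnalyzer(tables: TableAccess, validateTablePermission: Boolean = true) {
  def analyze(sourceTable: String): AnalysisResult =
    if (!validateTablePermission) {
      // Metadata exporter path: skip the permission check and the table read.
      AnalysisResult(permissionChecked = false, schema = None)
    } else {
      // Default path: fail fast if the caller cannot read the table,
      // then query it to derive the output schema.
      require(tables.hasReadPermission(sourceTable), s"No read permission on $sourceTable")
      AnalysisResult(permissionChecked = true, schema = Some(tables.readSchema(sourceTable)))
    }
}

// Simulates a source table the caller is not allowed to read.
object RestrictedTables extends TableAccess {
  def hasReadPermission(table: String): Boolean = false
  def readSchema(table: String): Seq[String] =
    throw new IllegalStateException(s"unexpected read of $table")
}
```

With the flag left at its default, analyzing a restricted table fails on the permission check. With the flag set to false, as the metadata exporter would do, the same call returns without touching the table.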

Why / Goal

Test Plan

Checklist

Reviewers

@ezvz @nikhilsimha

ezvz commented 2 months ago

@hanyuli1995 small nit, but let's avoid internal Slack links in the open source channel. It would be good to have context here that anyone can read and understand.