airbnb / chronon

Chronon is a data platform for serving AI/ML applications.
Apache License 2.0
717 stars, 44 forks

Export the metadata team -> groupBy in metadata-uploader #770

Closed by yuli-han 3 months ago

yuli-han commented 3 months ago

Summary

Add metadata export for the team -> Chronon entity (group_by and join) mapping in metadata-uploader. The Mussel dataset name CHRONON_ENTITY_BY_TEAM has been added by the Mussel team. https://airbnb.slack.com/archives/C01SS67TEUQ/p1718730788359319

group_bys/trust_v21 -> sample_group_by.v1,
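
A minimal sketch of the grouping this export performs. It assumes that the key is the first two segments of the conf path (entity type and team), which is what the sample logs under Test Plan show. The function name `group_confs_by_team` is hypothetical and written in Python for illustration only; it is not the MetadataStore implementation.

```python
from collections import defaultdict


def group_confs_by_team(conf_paths):
    """Group conf paths under an '<entity_type>/<team>' key.

    Hypothetical helper: assumes each path looks like
    '<entity_type>/<team>/<conf_name>', as in the sample logs.
    """
    by_team = defaultdict(list)
    for path in conf_paths:
        parts = path.split("/")
        if len(parts) < 3:
            # Skip paths that do not have the expected three-segment shape.
            continue
        key = "/".join(parts[:2])  # e.g. 'group_bys/cs_ds'
        by_team[key].append(path)
    return dict(by_team)


# Conf paths taken from the sample logs in the test plan.
confs = [
    "group_bys/cs_ds/test.v1",
    "joins/knowledge_graph/test.v1",
]
mapping = group_confs_by_team(confs)
```

Each key/value pair in `mapping` corresponds to one "Putting metadata" log entry for the CHRONON_ENTITY_BY_TEAM dataset.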

Why / Goal

Test Plan

Sample log:

2024-06-24 20:54:22 INFO  MetadataStore:204 - Putting metadata for
dataset: CHRONON_ENTITY_BY_TEAM
key: group_bys/cs_ds
conf: List(group_bys/cs_ds/test.v1)

Local test for joins:

python3 ~/.local/bin/run.py --mode=metadata-upload --conf production/joins// --chronon-jar ~/test/chronon-embedded.jar | tee ~/test/metadata_upload.log

Sample log:

2024-06-24 20:56:14 INFO  MetadataStore:204 - Putting metadata for
dataset: CHRONON_ENTITY_BY_TEAM
key: joins/knowledge_graph
conf: List(joins/knowledge_graph/test.v1)

Local test fetching:

2024-06-24 22:44:20 INFO Fetcher:206 - [test fetcher] Success(ArraySeq(group_bys/trust_test/test_v1))

Checklist

Reviewers

@haozhen-ding