Closed yuli-han closed 1 month ago
@better365 I have updated the dataset name from zipline to chronon. For the Airbnb use case we also need to change the code in mussel; otherwise the job will still fail. I will raise another PR in treehouse once this PR gets stamped.
Summary
Right now, metadata upload to the k-v store supports only one kind of key-value pair: key -> conf. We want to add a general metadata endpoint class to support more use cases.
This PR adds two general classes, MetadataEndPoint and MetadataDirWalker.
MetadataEndPoint:
Defined by an extract function and an endpoint name. The extract function produces a key-value pair from a Conf (which could be a Join, GroupBy, or StagingQuery) and a file path (string). The name is the dataset name used when we send the data to the k-v store.
MetadataDirWalker:
Walks the directory, iterates over all the config files, and generates k-v pair metadata for each of the metadata endpoints provided.
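As a rough illustration of how the two classes fit together, here is a minimal sketch. It is not the code in this PR: `EndPointSketch`, `DirWalkerSketch`, the `parse` parameter, and the return shape are all hypothetical names chosen for the example, and the conf type is left generic instead of using the real Join/GroupBy/StagingQuery classes.

```scala
import java.io.File

// Hypothetical shape: an endpoint is a dataset name plus an extract function
// that turns (file path, parsed conf) into one key-value pair.
final case class EndPointSketch[Conf](
    name: String,
    extract: (String, Conf) => (String, String)
)

object DirWalkerSketch {
  // Recursively collect every regular file under `dir`.
  private def listFiles(dir: File): Seq[File] = {
    val children = Option(dir.listFiles).map(_.toSeq).getOrElse(Seq.empty)
    children.flatMap(f => if (f.isDirectory) listFiles(f) else Seq(f))
  }

  // Walk `root`, parse each file with the caller-supplied `parse` (assumed to
  // return None for files that are not valid confs), and apply every endpoint.
  // Result: dataset name -> (key -> values), ready to send to a k-v store.
  def run[Conf](
      root: File,
      parse: File => Option[Conf],
      endPoints: Seq[EndPointSketch[Conf]]
  ): Map[String, Map[String, Seq[String]]] = {
    val parsed = listFiles(root)
      .sortBy(_.getPath)
      .flatMap(f => parse(f).toList.map(conf => (f.getPath, conf)))
    endPoints.map { ep =>
      val pairs = parsed.map { case (path, conf) => ep.extract(path, conf) }
      ep.name -> pairs.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2) }
    }.toMap
  }
}
```

Keeping the extract function separate from the walker means a new use case only needs a new endpoint value; the directory traversal stays unchanged.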
The PR adds two metadata endpoints, CHRONON_METADATA and CHRONON_METADATA_BY_TEAM:
CHRONON_METADATA: key -> conf JSON in string format, e.g. joins/team/team.example_join.v1 -> {...}
CHRONON_METADATA_BY_TEAM: type/team -> list of keys in string format, e.g. joins/team -> a, b, c
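The two key layouts above could be derived roughly as follows. This is a sketch under assumptions, not the PR's implementation: `EndPointKeysSketch` and its methods are hypothetical, the input path is assumed to be already relative to the config root, and the by-team list is assumed to hold the full conf keys.

```scala
object EndPointKeysSketch {
  // Assumption: `path` is relative to the config root,
  // e.g. "joins/team/team.example_join.v1", and is used as the conf key as-is.
  def confKey(path: String): String = path

  // Keep the first two segments: "joins/team/team.example_join.v1" -> "joins/team".
  def teamKey(path: String): String = path.split("/").take(2).mkString("/")

  // CHRONON_METADATA-style layout: key -> conf JSON string.
  def byConf(confs: Seq[(String, String)]): Map[String, String] =
    confs.map { case (path, json) => confKey(path) -> json }.toMap

  // CHRONON_METADATA_BY_TEAM-style layout: type/team -> list of conf keys.
  def byTeam(confs: Seq[(String, String)]): Map[String, Seq[String]] =
    confs.map { case (path, _) => confKey(path) }.groupBy(teamKey)
}
```

For example, given confs at `joins/team/team.a.v1` and `joins/team/team.b.v1` (illustrative paths), `byTeam` groups both keys under `joins/team`, which is the lookup a "list everything owned by a team" query would need.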
Why / Goal
Test Plan
Tested by running the metadata-upload and fetch commands for a join. https://docs.google.com/document/d/1X7n_jskS7JyiiqVB3pStgPilg6luy23ho58twCHeip8/edit?usp=sharing
Checklist
Reviewers