Originally created by @aaronsteers on 2022-05-18 15:46:29
The use case here is for users to create artifacts detailing the data structures and data profiling outputs associated with their project. Over time, Meltano could expand the ways in which this data is used.
Today, we have much of this data in .meltano internal artifacts (such as the Singer catalog files), but we don't have any well-defined means of working with these artifacts, we don't provide holistic diff/compare options, and we don't have a single place where a user could publish or search their schema definitions, for instance.
Schema catalog change detection
meltano catalog snapshot create all # Create artifacts for all taps' schema and for known dbt models' schemas
meltano catalog snapshot update all # Update artifacts
meltano catalog snapshot diff tap-gitlab --from=<old-path> --to=<new-path> # Print a diff of just the tap-gitlab artifacts
The user could presumably choose whether they want these artifacts committed to their repo or not.
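To make the diff idea concrete, here is a minimal sketch of what comparing two catalog snapshots could look like. It assumes each snapshot is a Singer catalog JSON file with a top-level `streams` list, where each entry carries a `tap_stream_id` and a `schema.properties` mapping. The function names and the report shape are hypothetical, not an existing Meltano API.

```python
import json
from pathlib import Path


def load_stream_properties(catalog: dict) -> dict:
    """Map each stream ID to its {property name: JSON schema} dict."""
    streams = {}
    for stream in catalog.get("streams", []):
        # Assumption: streams are keyed by tap_stream_id, falling back to "stream".
        stream_id = stream.get("tap_stream_id") or stream.get("stream")
        streams[stream_id] = stream.get("schema", {}).get("properties", {})
    return streams


def diff_catalogs(old: dict, new: dict) -> dict:
    """Report added/removed streams and per-stream property changes."""
    old_streams = load_stream_properties(old)
    new_streams = load_stream_properties(new)
    report = {
        "added_streams": sorted(set(new_streams) - set(old_streams)),
        "removed_streams": sorted(set(old_streams) - set(new_streams)),
        "changed_streams": {},
    }
    for stream_id in sorted(set(old_streams) & set(new_streams)):
        old_props = old_streams[stream_id]
        new_props = new_streams[stream_id]
        changes = {
            "added": sorted(set(new_props) - set(old_props)),
            "removed": sorted(set(old_props) - set(new_props)),
            "modified": sorted(
                name
                for name in set(old_props) & set(new_props)
                if old_props[name] != new_props[name]
            ),
        }
        # Only list streams where at least one property changed.
        if any(changes.values()):
            report["changed_streams"][stream_id] = changes
    return report


def diff_catalog_files(old_path: str, new_path: str) -> dict:
    """Hypothetical helper: diff two snapshot files on disk."""
    old = json.loads(Path(old_path).read_text())
    new = json.loads(Path(new_path).read_text())
    return diff_catalogs(old, new)
```

A real implementation would probably also want to compare stream metadata and key properties, but property-level changes are likely the first thing users would look for.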
Community plugin first approach
This is a big undertaking and would likely need to go through multiple iterations before we land on a stable interface.
To allow faster iteration, this could in theory be built first as a utility plugin and published to the hub.
meltano add utility meltano-catalog-util
meltano run meltano-catalog-util:create all
meltano run meltano-catalog-util:update all
meltano run meltano-catalog-util:diff tap-gitlab
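As a rough illustration of how the plugin's own entry point might mirror the commands above, here is a small argument-parser sketch. The `meltano-catalog-util` executable does not exist yet, so the subcommand layout, the `target` argument, and the `old_path`/`new_path` names are all assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser for a hypothetical meltano-catalog-util executable."""
    parser = argparse.ArgumentParser(prog="meltano-catalog-util")
    subcommands = parser.add_subparsers(dest="command", required=True)

    # create/update take a plugin name, or "all" for every known plugin.
    for name in ("create", "update"):
        sub = subcommands.add_parser(name)
        sub.add_argument("target", help="plugin name, or 'all'")

    # diff compares two snapshot locations for a single plugin.
    diff = subcommands.add_parser("diff")
    diff.add_argument("target", help="plugin name, e.g. tap-gitlab")
    diff.add_argument("--from", dest="old_path", help="older snapshot path")
    diff.add_argument("--to", dest="new_path", help="newer snapshot path")
    return parser
```

Keeping the subcommand names identical to the proposed `meltano catalog snapshot` verbs would make it easier to move the feature into core later, if the interface stabilizes.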
Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/3505