Totally a good one! And also on the roadmap.
The nice thing is that we can totally reuse the work you have done (we could consider going down the full API route; dbt has two APIs, the admin API and the metadata API). The nice thing about the admin API is that it gives us direct access to the artifacts.
This means we can get that resolved pretty fast and then debate whether it might make more sense to talk to the metadata API (https://docs.getdbt.com/docs/dbt-cloud/dbt-cloud-api/metadata/metadata-overview), which may be preferable for a variety of reasons (I don't know yet).
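For context, a rough sketch of what talking to the metadata API could look like (it is a GraphQL API; the endpoint URL, the `models(jobId:)` query and the selected fields here are assumptions to double-check against the docs linked above):

```python
# Hypothetical sketch of querying the dbt Cloud metadata API (GraphQL).
# Endpoint, auth scheme, query and field names are assumptions to verify.
import requests

API_TOKEN = "..."  # dbt Cloud API token

QUERY = """
{
  models(jobId: 12345) {
    uniqueId
    name
    status
  }
}
"""

response = requests.post(
    "https://metadata.cloud.getdbt.com/graphql",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={"query": QUERY},
)
response.raise_for_status()
print(response.json()["data"]["models"])
```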
So I would say: let's first make sure our mechanism and integration for the OSS approach are working, and then we can tackle this. We'll probably need to get ourselves a dbt Cloud account to properly test it.
Thanks a lot for raising this proactively, I really like your thinking!
Is your feature request related to a problem? Please describe.
We would like to ingest the dbt artifacts from dbt Cloud.
Describe the solution you'd like
Ability to ingest the dbt artifacts from dbt Cloud.
We can use the dbt Cloud API for this. The rough logic would be to list the runs for the account via the admin API and then download the artifacts for the relevant run(s); a sketch follows below.
At minimum the user needs to provide the API token and the account id.
To consider: how to handle the paging of the API.
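A minimal sketch of that rough logic, assuming the v2 admin API endpoints for listing runs and fetching run artifacts, token-based auth, and limit/offset paging (the order_by parameter and the response shape should be verified against the API docs):

```python
# Hypothetical sketch: fetch dbt artifacts via the dbt Cloud admin API.
import requests

API_TOKEN = "..."   # provided by the user
ACCOUNT_ID = 12345  # provided by the user
BASE_URL = f"https://cloud.getdbt.com/api/v2/accounts/{ACCOUNT_ID}"
HEADERS = {"Authorization": f"Token {API_TOKEN}"}


def iter_runs(limit=100):
    """Yield the account's runs, newest first, following limit/offset paging."""
    offset = 0
    while True:
        response = requests.get(
            f"{BASE_URL}/runs/",
            headers=HEADERS,
            # limit/offset paging and the order_by value are assumptions to verify
            params={"limit": limit, "offset": offset, "order_by": "-finished_at"},
        )
        response.raise_for_status()
        runs = response.json()["data"]
        if not runs:
            return
        yield from runs
        offset += limit


def fetch_artifact(run_id, path):
    """Download a JSON artifact (e.g. manifest.json or run_results.json) for a run."""
    response = requests.get(
        f"{BASE_URL}/runs/{run_id}/artifacts/{path}", headers=HEADERS
    )
    response.raise_for_status()
    return response.json()
```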
Additional context
I think that the tricky part will be deciding which runs to upload the artifacts for. If users have an advanced job structure where jobs run sporadically, e.g. based on a trigger for when data is ingested into their warehouse, instead of on a (daily) schedule, it can become difficult to decide which artifacts to ingest. We should definitely first check whether the alternative below works.
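One way to handle this could be to only ingest the artifacts of the most recent successful run of a given job. A rough sketch building on the helper above; the job_definition_id field and the numeric status code for success (10) are assumptions to verify against the admin API docs:

```python
# Hypothetical: pick the most recent successful run of a job to ingest from.
def latest_successful_run(job_id):
    for run in iter_runs():  # runs come back newest first in the sketch above
        if run.get("job_definition_id") == job_id and run.get("status") == 10:
            return run
    return None
```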
Alternative
Test if a user can install `soda` in their job definition and add the `soda ingest dbt` command there. If so, then we do not need this issue per se.