sodadata / soda-sql

Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html
https://docs.soda.io/
Apache License 2.0

[BUG] ingesting dbt results from dbt Cloud ingest null result values --problem with dbt Cloud itself #193

Closed bastienboutonnet closed 2 years ago

bastienboutonnet commented 2 years ago

I noticed that when we ingest a run_results.json obtained from dbt Cloud, the number of failures and the status of each test are always null.

This causes the tests to look as if they are failing on Soda Cloud since we consider null values to be problematic (and rightly so).

For now, it looks like the issue is with dbt Cloud itself: the nulls are already present in the raw artifacts we receive and are not introduced by our transformations.

I have reached out to dbt Cloud support and will see what they say.

bastienboutonnet commented 2 years ago

It turns out the API requires the following:

What I'll do:

I believe this is better than silently ingesting garbage. Users might move steps around in their dbt jobs but forget to update a potentially scheduled soda ingest, in which case they could end up ingesting a step that now has nulls.

What do you think @vijaykiran ?

vijaykiran commented 2 years ago

Sounds good to me, @bastienboutonnet. If there is a bug/reference that we should follow up on with dbt, can you please add it as a TODO in the code at the relevant place?

bastienboutonnet commented 2 years ago

No, there isn't. I suspect it's a case of "It's on the roadmap TM": since it's not on the OSS side, there's no GH issue, and the support person didn't point me to one.

That being said, they're working on a v4 admin API, so maybe it'll be corrected there (https://docs.getdbt.com/docs/dbt-cloud/dbt-cloud-api/admin-cloud-api).

bastienboutonnet commented 2 years ago

Unfortunately, although the query works with a / at the end, it doesn't actually return the results from the correct step.

This has been escalated to a support engineer and they'll be reaching out.

For the moment, all we can do is two things. First, add a warning to the docs advising people to set up their jobs so that they end on a dbt build or dbt test step (i.e. not generating the docs automatically). Second, put in place a check that not all run result failures are null.

When we get a resolution about the step behaviour, we'll address it in a follow-up PR.
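A minimal sketch of what that null check could look like, for discussion. It assumes the artifact has a top-level "results" list whose entries carry a "failures" field, as run_results.json does. The function and exception names are hypothetical and not taken from the soda-sql codebase.

```python
from typing import Any, Dict


class AllFailuresNullError(ValueError):
    """Hypothetical error: every result in the artifact has a null failure count."""


def assert_not_all_failures_null(run_results: Dict[str, Any]) -> None:
    """Refuse to ingest an artifact in which every result has failures == null.

    Assumption: `run_results` is the parsed run_results.json, with a top-level
    "results" list whose items have a "failures" key. Some nulls are tolerated
    (for example, results that are not tests); only the all-null case is
    treated as a sign that the wrong step was fetched.
    """
    results = run_results.get("results", [])
    if not results:
        # Nothing to judge; leave the empty case to the caller.
        return
    if all(result.get("failures") is None for result in results):
        raise AllFailuresNullError(
            "All run results have null failures; the ingested step is probably "
            "not a dbt build or dbt test step."
        )
```

Raising instead of logging keeps a scheduled ingest from silently pushing null results to Soda Cloud.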

bastienboutonnet commented 2 years ago

dbt Cloud support confirmed that the step should work out as expected, with no slash at the end.

They could not see some of the failed queries I made, so we're putting it down to a network issue of some kind.

I'll be adding the steps functionality in the coming days or we can decide to plan it in the next sprint.
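For reference, a rough sketch of how the step-scoped artifact URL could be built with no trailing slash. The endpoint shape (v2 `artifacts` path plus a `step` query parameter) is my reading of the dbt Cloud API docs and is an assumption here, not something confirmed in this thread. The helper name and defaults are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode


def build_artifact_url(
    account_id: int,
    run_id: int,
    step: Optional[int] = None,
    base_url: str = "https://cloud.getdbt.com/api/v2",  # assumed base URL
    artifact: str = "run_results.json",
) -> str:
    """Build the URL for a run artifact, optionally scoped to one job step.

    Assumption: the API accepts `?step=<index>` on the artifact path and the
    path itself must not end with a slash.
    """
    url = f"{base_url.rstrip('/')}/accounts/{account_id}/runs/{run_id}/artifacts/{artifact}"
    if step is not None:
        # Append the step as a query parameter; no trailing slash is added.
        url = f"{url}?{urlencode({'step': step})}"
    return url
```

Keeping URL construction in one small function would make it easy to adjust once the step behaviour is settled.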