A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
Describe the bug
In the last few days we have identified a number of bugs in how NULLs are handled, following a report from a client. We do have tests for these cases in our test suite, but due to a very subtle bug in the test suite itself, they were passing when they should not have been. The test suite has now been fixed and we have fixed the NULL issues. We will be releasing a public fix for this shortly (this week or early next week). You may grab the latest version of the code (unreleased) from the following branch of our development repository:
https://github.com/Datavault-UK/dbtvault-dev/tree/int/0.7.4
Alternatively, as mentioned, there will be a short wait for the public release.
The effect of these bugs is that records are being loaded into structures even though their respective PKs or FKs contain NULL values. This is not desired behaviour as per Data Vault 2.0 standards: records with NULL keys mean nothing to the business.
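To illustrate the intended behaviour, here is a minimal Python sketch of the rule described above: a record is skipped when any of its key columns is NULL. This is not dbtvault's implementation (dbtvault generates SQL through dbt macros); the function names and the column names `CUSTOMER_PK` and `ORDER_FK` are hypothetical and chosen only for the example.

```python
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]


def has_null_key(record: Record, key_columns: Iterable[str]) -> bool:
    """Return True if any key column is missing or None (NULL)."""
    return any(record.get(col) is None for col in key_columns)


def loadable_records(records: Iterable[Record], key_columns: Iterable[str]) -> List[Record]:
    """Keep only the records whose key columns are all non-NULL."""
    keys = list(key_columns)  # materialise once so it can be reused per record
    return [r for r in records if not has_null_key(r, keys)]


# Hypothetical staged rows: only the first has both keys populated.
staged = [
    {"CUSTOMER_PK": "a1", "ORDER_FK": "b1"},
    {"CUSTOMER_PK": None, "ORDER_FK": "b2"},
    {"CUSTOMER_PK": "a3", "ORDER_FK": None},
]

loaded = loadable_records(staged, ["CUSTOMER_PK", "ORDER_FK"])
```

In this sketch only the first staged row would be loaded; the other two are dropped because one of their keys is NULL.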
Standard practice is to replace NULLs with tokens or codes to signify whether they are mandatory and missing, optional and missing, or just missing (typically 0000..., 1111...., 2222....). We will be developing documentation on these practices and on how dbtvault handles NULLs, so that users can make more informed decisions and understand how and why NULLs are handled as they are.
Please reach out to us on Slack or via email if you have been affected by this bug in production, and we can offer guidance on fixing this. In most cases this is not a significant cause for concern, and Data Vault can support these fixes without much headache.
Versions
dbt: v0.19.0
dbtvault: v0.7.3
Affected structures