Open pettersoderlund opened 1 year ago
@pettersoderlund just to confirm, this isn't solved by your PR we just merged, is it? I'm trying to understand why this is broken now. Thanks!
@mjirv no that's correct! - this problem is the same as stated in https://github.com/mjirv/dbt-datamocktool/issues/66#issuecomment-1486932968
The PR just merged just fixes the bug with compare columns.
nooo sorry -- this is not finished. Sorry if I was unclear. This feature is not fixed with the last PR
ah ok I misread you 😅
Just had a quick look at this. I haven't context loaded about how this used to work and what's changed, but if we were happy with it being a good solution for array columns we might be able to retrace our steps in getting it working again. We should also call out this as being a workaround in the readme.
The other option I can think of would be to create a new parameter like actual_output_field_mapping
which accepts a field mapped to a transformation eg.
models:
- name: stg_customers
tests:
- dbt_datamocktool.unit_test:
input_mapping:
source('jaffle_shop', 'raw_customers'): ref('dmt__raw_customers_1')
expected_output: ref('dmt__expected_stg_customers_1')
actual_output_field_mapping:
my_array_column: TO_JSON_STRING(my_array_column)
depends_on:
- ref('raw_customers')
And we'd apply that transformation when selecting fields in the actual output, just before we compare models. Have no idea how to do that elegantly though 😂
I figured we couldn't pass a parameter like array_columns=[...]
because the function to transform the column to a string will be different in each data warehouse unfortunately
Describe the feature
Support for testing columns with data type array in unit tests when working with Bigquery.
Describe alternatives you've considered
In our organisation we have used datamocktool to test arrays by specifying "compare_columns: - TO_JSON_STRING(array_col)". This has been working fine with 0.2.1 of dmt but was no longer working in >=0.2.2.
Additional context
Providing function-wrapped columns to compare columns should be useful in many databases. However I only have experience doing this in BigQuery.
The change of the handling of compare_columns was changed 0.2.2 where we went from a straight passing on of parameter compare columns to
dbt_utils.test_equality
to the new way of "finding excluded columns" to provide toaudit_helper.compare_relations
with parameter exclude_columns.Who will this benefit?
Anyone using arrays or structs in tables wanting to unit test also these columns.
Are you interested in contributing this feature?
I feel that i need some input to be able to get started on this.