- Adds new macro, `get_json_columns_in_relation` to return a list of columns that are of type JSON (on BigQuery only, as we haven't rolled out JSON support in other Fivetran destinations).
- Includes call to `get_json_columns_in_relation` in `fill_staging_columns`, so that each JSON field gets converted to a json-formatted STRING. This is necessary because only some connectors have JSONs currently and downstream json functions do not work seamlessly with the actual JSON-type columns.
- `fill_staging_columns` was chosen as this is called in every (most?) source package model. therefore, we don't have to mass update a bunch of packages. however, we want this to be right and not break anything!
**If this PR introduces a new macro, how did you test the new macro?**
get_json_columns_in_relation
I first tested get_json_columns_in_relation by directly calling it in a local test model.
the model i ran it against was just a select * from Joe's seed file
tiktok_ads_source_integration_tests_2.tiktok_adgroup_history_data, which contains a JSON field called AGE_GROUPS.
When running, the test model, i was able to get the macro to return and output ['AGE_GROUPS']
fill_staging_columns
I tested the incorporation of get_json_columns_in_relation into fill_staging_columns by running the tiktok ads source package (specifically from the integration_tests folder) locally.
With tiktok_adgroup_history_data as the source, the model where trouble would pop up was stg_tiktok_ads__ad_group_history, specifically on the line where we try to coalesce age_groups and an actual-string field, age.
With my additions, the fields CTE in stg_tiktok_ads__ad_group_history compiles to the following
fields as (
select
action_days
as
action_days
,
adgroup_id
as
adgroup_id
# ................ cutting stuff out that's not relevant .......................... #
,
cast(null as string) as
age
,
TO_JSON_STRING(
age_groups
)
as
age_groups
,
languages
as
languages
, cast('' as string) as source_relation
from base
),
And so the coalesce of age_groups and age in the final CTE does not cause an error and the model runs successfully.
and the data looks OK in bigquery
If this PR introduces a modification to an existing macro, which packages is the macro currently present in and what steps were taken to test compatibility across packages?
Honestly we should test this with other packages as well -- ones with JSONs and not -- given the far-reaching nature of this macro change. perhaps we could have some customers try this out
Did you update the README to reflect the macro addition/modifications?
if we want to move forward with this PR we should add a variable that is a boolean, telling us whether or not to convert back to strings or maintain the json type
What change does this PR introduce?
- Adds new macro, `get_json_columns_in_relation` to return a list of columns that are of type JSON (on BigQuery only, as we haven't rolled out JSON support in other Fivetran destinations). - Includes call to `get_json_columns_in_relation` in `fill_staging_columns`, so that each JSON field gets converted to a json-formatted STRING. This is necessary because only some connectors have JSONs currently and downstream json functions do not work seamlessly with the actual JSON-type columns. - `fill_staging_columns` was chosen as this is called in every (most?) source package model. therefore, we don't have to mass update a bunch of packages. however, we want this to be right and not break anything! **If this PR introduces a new macro, how did you test the new macro?**get_json_columns_in_relation
I first tested
get_json_columns_in_relation
by directly calling it in a local test model.the model i ran it against was just a select * from Joe's seed file
tiktok_ads_source_integration_tests_2.tiktok_adgroup_history_data
, which contains a JSON field calledAGE_GROUPS
.When running, the test model, i was able to get the macro to return and output
['AGE_GROUPS']
fill_staging_columns
I tested the incorporation of
get_json_columns_in_relation
intofill_staging_columns
by running the tiktok ads source package (specifically from theintegration_tests
folder) locally.With
tiktok_adgroup_history_data
as the source, the model where trouble would pop up wasstg_tiktok_ads__ad_group_history
, specifically on the line where we try to coalesceage_groups
and an actual-string field,age
.With my additions, the fields CTE in
stg_tiktok_ads__ad_group_history
compiles to the followingAnd so the coalesce of
age_groups
andage
in the final CTE does not cause an error and the model runs successfully.and the data looks OK in bigquery
If this PR introduces a modification to an existing macro, which packages is the macro currently present in and what steps were taken to test compatibility across packages?
Honestly we should test this with other packages as well -- ones with JSONs and not -- given the far-reaching nature of this macro change. perhaps we could have some customers try this out
Did you update the README to reflect the macro addition/modifications?