Closed MKloseINF closed 3 months ago
Hi @MKloseINF and thanks for reaching out!
We use the Block Hashdiff for Multiactivity, meaning in your example, all participants of one multi-active group have the same hashdiff:
source data first load: HK |HD |Line | Quantity | LDTS ABCD | 5678 | 1 | 5 | 14.8.2024 ABCD | 5678 | 2 | -10 | 14.8.2024
MA sat after first load: HK |HD |Line | Quantity | LDTS ABCD | 5678 | 1 | 5 | 14.8.2024 ABCD | 5678 | 2 | -10 | 14.8.2024
Then the second load would not be seen as a delta, and no insert happens. For more information about the block-hashdiff, generated by LISTAGG(), see our blog article here: https://www.scalefree.com/blog/data-vault/using-multi-active-satellites-the-correct-way-2-2/
Let me know if this helped!
Hi @tkirschke,
unfortunately that won't help and macros/tables/snowflake/ma_sat_v0.sql didn't use LISTAGG function.
When looking at the Jinja macro for incremental the row number = 1 throws you only back one row (random) but ignores in above exampel LINE 1 and 2 (MA key). Records to insert part of sql statement identifies then 1 column as new.
{# Get the latest record for each parent hashkey in existing sat, if incremental. #} {%- if is_incremental() %} latest_entries_in_sat AS (
SELECT
{{ parent_hashkey }},
{{ ns.hdiff_alias }}
FROM
{{ this }}
QUALIFY ROW_NUMBER() OVER(PARTITION BY {{ parent_hashkey|lower }} ORDER BY {{ src_ldts }} DESC) = 1
), {%- endif %}
Hi @MKloseINF ,
LISTAGG() or STRINGAGG() is used in the Staging model, as soon as multi-activity is configured. Can you confirm, that your stage model has the multi active config properly defined, as seen in Example 3 in the wiki: https://github.com/ScalefreeCOM/datavault4dbt/wiki/Staging#example-3
Please send the stage model and Ma sat v0 model, so we can help you better.
yes I checked and everythings looks good from my point of view.
Stage: `{{ config(materialized="table") }}
{%- set yaml_metadata -%} source_model: 'DS_AFDPNLT0011' ldts: 'CURRENT_TIMESTAMP' rsrc: '!DS_AFDPNLT0011' hashed_columns: hk_faabstnk_h:
{%- set metadata_dict = fromyaml(yaml_metadata) -%}
{%- set source_model = metadata_dict["source_model"] -%} {%- set ldts = metadata_dict["ldts"] -%} {%- set rsrc = metadata_dict["rsrc"] -%} {%- set hashed_columns = metadata_dict["hashed_columns"] -%} {%- set derived_columns = metadata_dict["derived_columns"] -%} {%- set prejoined_columns = metadata_dict["prejoined_columns"] -%} {%- set missing_columns = metadata_dict["missing_columns"] -%}
{{ datavault4dbt.stage( source_model=source_model, ldts=ldts, rsrc=rsrc, hashed_columns=hashed_columns, derived_columns=derived_columns, prejoined_columns=prejoined_columns, missing_columns=missing_columns, ) }}`
MA Sat: `{{ config(materialized="incremental") }}
{%- set yaml_metadata -%} source_model: 'DS_STG_AFDPNLT0011' parent_hashkey: 'hk_AFDPNLT001_l' src_hashdiff: 'hd_AFDPNLT0011_s' src_ma_key:
{%- set metadata_dict = fromyaml(yaml_metadata) -%}
{{ datavault4dbt.ma_sat_v0( source_model=metadata_dict["source_model"], parent_hashkey=metadata_dict["parent_hashkey"], src_hashdiff=metadata_dict["src_hashdiff"], src_ma_key=metadata_dict["src_ma_key"], src_payload=metadata_dict["src_payload"], ) }} `
Hi @MKloseINF
The Multi Active Config seems fine, but it's not passed into the stage macro:
{%- set metadata_dict = fromyaml(yaml_metadata) -%}
{%- set source_model = metadata_dict["source_model"] -%}
{%- set ldts = metadata_dict["ldts"] -%}
{%- set rsrc = metadata_dict["rsrc"] -%}
{%- set hashed_columns = metadata_dict["hashed_columns"] -%}
{%- set derived_columns = metadata_dict["derived_columns"] -%}
{%- set prejoined_columns = metadata_dict["prejoined_columns"] -%}
{%- set missing_columns = metadata_dict["missing_columns"] -%}
{%- set multi_active_config = metadata_dict["multi_active_config"] -%}
{{
datavault4dbt.stage(
source_model=source_model,
ldts=ldts,
rsrc=rsrc,
hashed_columns=hashed_columns,
derived_columns=derived_columns,
prejoined_columns=prejoined_columns,
missing_columns=missing_columns,
multi_active_config=multi_active_config
)
}}`
Let me know if this solves the issue.
indeed you're right. we took the "wrong" example code which is missing multi active config.
I added missing lines but now generated code throws errors, see attached file console_output.txt
does main_hashkey_column work with link hash key? Looking at example 3 there is hk_test used which is not specified in whole definition code
Hi @MKloseINF,
the error in your compiled code is caused by an additional comma, this was fixed here. Which version of datavault4dbt are you using?
Kind regards Theo
Hi @tkiehn
we were on version 1.3.1....so I did the upgrade to 1.3.3. Additional comma error resolved, but now new error invalid identifier 'HK_AFDPNLT001_L' occurs. I can point it out where it comes from. We are using a MA satellite for link. The link hash key is calculated after the hash key, but we need the link hash key (HK_AFDPNLT001_L) in hash diff calculation which fails due to later calculation.
But it works changing model spec order...first link hash key, then hashdiff key
Thanks a lot for your support
Thanks a lot
MA satellite only uses hash key and hash diff to identify latest entries in sat. Multi active attribute(s) are ignored in partition clause. Using row number to qualify results in exactly one hash + hash diff key. But if you've multi active for same ldts in your satellite only one is identified and a similar row is loaded.
Example: Line is MA attribute
source data first load: HK |HD |Line | Quantity | LDTS ABCD | 5678 | 1 | 5 | 14.8.2024 ABCD | 3456 | 2 | -10 | 14.8.2024
MA sat after first load: HK |HD |Line | Quantity | LDTS ABCD | 5678 | 1 | 5 | 14.8.2024 ABCD | 3456 | 2 | -10 | 14.8.2024
source data second load (maybe same data because of no increment from source): HK |HD |Line | Quantity | LDTS ABCD | 5678 | 1 | 5 | 15.8.2024 ABCD | 3456 | 2 | -10 | 15.8.2024
MA sat after second load: HK |HD |Line | Quantity | LDTS ABCD | 5678 | 1 | 5 | 14.8.2024 ABCD | 3456 | 2 | -10 | 14.8.2024 ABCD | 5678 | 1 | 5 | 15.8.2024
Which line for second load will be identified as diff is random because of ROW function.