Datavault-UK / automate-dv

A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
https://www.automate-dv.com
Apache License 2.0
478 stars 114 forks source link

[BUG] Incremental updates on satellites fail with case sensitive SQL Server collations #209

Closed geoffwi40 closed 4 months ago

geoffwi40 commented 9 months ago

Describe the bug When using case sensitive collations, automate_dv.sat incremental updates fail due to the "latest_records" being declared as lower case and then used as upper case in the left join for the records to insert. This results in an invalid object error (on synapse)

Environment dbt version: dbt-core 1.3.5, dbt-synapse 1.3.2 automate-dv version: 0.10.1 Database/Platform: Azure Synapse Dedicated SQL Pool, Collation = SQL_Latin1_General_CP1_CS_AS (technically not supported but works fine as it's inherited from SQL Server)

To Reproduce Steps to reproduce the behavior:

  1. Create a target with a case sensitive collation
  2. create a satellite and then run DBT RUN twice to get incremental load

Screenshots image

Expected behavior Should have been lower case in the join or at least consistent

Additional context

Compiled SQL fragment

latest_records AS ( SELECT b.account_links_hk, b.account_links_hashdiff, b.accl_seqno, b.accl_last_change_date, b.accl_subacc_no, b.accl_last_change_by, b.accl_soc_seqno, b.accl_holder_pos, b.accl_soc_control, b.accl_dispensing_notice, b.accl_brn_code, b.accl_account_no, b.accl_old_no, b.accl_extract_required, b.accl_gact_type, b.accl_cust_seqno, b.accl_cust_usage, b.account_links_nk, b.account_and_subacc_no_nk, b.customer_nk, b.end_date, b.deleted, b.active, b.account_and_subacc_no_hk, b.customer_hk, b.start_date, b.load_datetime, b.source FROM (......

records_to_insert AS ( SELECT frin.account_links_hk, frin.account_links_hashdiff, frin.accl_seqno, frin.accl_last_change_date, frin.accl_subacc_no, frin.accl_last_change_by, frin.accl_soc_seqno, frin.accl_holder_pos, frin.accl_soc_control, frin.accl_dispensing_notice, frin.accl_brn_code, frin.accl_account_no, frin.accl_old_no, frin.accl_extract_required, frin.accl_gact_type, frin.accl_cust_seqno, frin.accl_cust_usage, frin.account_links_nk, frin.account_and_subacc_no_nk, frin.customer_nk, frin.end_date, frin.deleted, frin.active, frin.account_and_subacc_no_hk, frin.customer_hk, frin.start_date, frin.load_datetime, frin.source FROM first_record_in_set AS frin LEFT JOIN LATEST_RECORDS lr ON lr.account_links_hk = frin.account_links_hk AND lr.account_links_hashdiff = frin.account_links_hashdiff WHERE lr.account_links_hashdiff IS NULL UNION SELECT ........etc

DVAlexHiggs commented 4 months ago

Fixed in v0.10.2 😄 Thanks for your patience for release of this! Please let us know if you experience any issues by responding here or opening a new issue.