Datavault-UK / automate-dv

A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
https://www.automate-dv.com
Apache License 2.0
513 stars 131 forks

[BUG] The Data is not getting copied correctly from raw tables to stage in Databricks env #184

Closed samanax closed 1 year ago

samanax commented 1 year ago

Describe the bug

Using the following model to copy data from raw tables to staging:

{%- set yaml_metadata -%}
source_model: 'raw_account'
{%- endset -%}
{% set metadata_dict = fromyaml(yaml_metadata) %}
{% set source_model = metadata_dict['source_model'] %}
{{ dbtvault.stage(include_source_columns=true,
                  source_model=source_model,
                  derived_columns=derived_columns,
                  hashed_columns=hashed_columns,
                  ranked_columns=none) }}

Instead of containing the rows from the raw table, the staging view contains the column names as the data in every row.
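One way a symptom like this can arise on Databricks is shown below. It is an illustration only, not a diagnosis confirmed anywhere in this thread. By default, Spark SQL treats double-quoted tokens as string literals and uses backticks to quote identifiers, so a generated SELECT that wraps column names in double quotes returns the names themselves. The column name `ACCOUNT_ID` is hypothetical.

```sql
-- Assumes a table raw_account with a (hypothetical) column ACCOUNT_ID.

-- Double quotes: Spark SQL reads "ACCOUNT_ID" as a string literal by default,
-- so every row contains the text ACCOUNT_ID instead of the stored value.
SELECT "ACCOUNT_ID" AS ACCOUNT_ID FROM raw_account;

-- Backticks: `ACCOUNT_ID` is an identifier,
-- so every row contains the column's actual value.
SELECT `ACCOUNT_ID` AS ACCOUNT_ID FROM raw_account;
```

Comparing the compiled SQL under the project's target directory against these two forms would show which quoting style the staging model was generated with.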

Environment

dbt version: 1.4.1
dbtvault version: 0.8.3
Database/Platform: Databricks 1.4.1

and also

dbt version: 1.3.2
dbtvault version: 0.8.3
Database/Platform: Databricks 1.3.2

To Reproduce

Steps to reproduce the behavior:

  1. Have example raw data in the Databricks Delta Lake.
  2. Create a raw Delta table from the original data.
  3. Using the model above, create a view or table for the staging layer.
  4. Run the dbt run command, then query the staging view to see the data.
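The model in this report passes `derived_columns` and `hashed_columns` to the macro without defining them. For anyone reproducing step 3, here is a minimal sketch of a staging model in which both are set from the YAML metadata. The column names (`ACCOUNT_ID`, `ACCOUNT_HK`, `RECORD_SOURCE`) are hypothetical and not taken from the reporter's table.

```sql
{# Hypothetical staging model, e.g. models/stage/stg_account.sql #}
{%- set yaml_metadata -%}
source_model: 'raw_account'
derived_columns:
  # A leading '!' marks a constant value rather than a source column
  RECORD_SOURCE: '!RAW_ACCOUNT'
hashed_columns:
  # Hash key built from the (hypothetical) business key column
  ACCOUNT_HK: 'ACCOUNT_ID'
{%- endset -%}

{% set metadata_dict = fromyaml(yaml_metadata) %}

{# Pull each section out of the parsed YAML before passing it to the macro #}
{% set source_model = metadata_dict['source_model'] %}
{% set derived_columns = metadata_dict['derived_columns'] %}
{% set hashed_columns = metadata_dict['hashed_columns'] %}

{{ dbtvault.stage(include_source_columns=true,
                  source_model=source_model,
                  derived_columns=derived_columns,
                  hashed_columns=hashed_columns,
                  ranked_columns=none) }}
```

With `include_source_columns=true`, the staging view should carry every raw column plus the derived and hashed ones, so querying a raw column from the view in step 4 gives a direct check against the raw table.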

Expected behavior

The data should be copied from the raw table to the destination view or table unchanged.

Screenshots

The stage view with wrong data: Screenshot 2023-02-13 at 10 21 59

The data in the raw table: Screenshot 2023-02-13 at 10 24 04

DVAlexHiggs commented 1 year ago

Closing, as this was caused by using an old version of dbtvault. Fixed after 0.84.