Datavault-UK / automate-dv

A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
https://www.automate-dv.com
Apache License 2.0
509 stars 131 forks source link

[BUG] stage macro error with dry run in SQL Server #249

Open guaripete-github opened 1 month ago

guaripete-github commented 1 month ago

Describe the bug Executing dbt run --empty on a model with stage macro with a source relation as source_model fails. Model runs successfully if --empty is not invocated.

Environment

dbt version: 1.8.6 automate_dv version: 0.11.0 Database/Platform: sqlserver 1.8.4

Expected behavior The model to run.

Additional context

Issue seems to be related to

# sqlserver_relation.py

    def render_limited(self) -> str:
        rendered = self.render()
        if self.limit is None:
            return rendered
        elif self.limit == 0:
            return f"(select * from {rendered} where 1=0) {self._render_limited_alias()}"
        else:
            return f"(select TOP {self.limit} * from {rendered}) {self._render_limited_alias()}"

in https://github.com/dbt-msft/dbt-sqlserver/blob/ec30afe13b3bd487535f5342ef3be79d06c63edc/dbt/adapters/sqlserver/sqlserver_relation.py#L33

effectively replacing source() to return a sub-query instead of the expected relation object and breaking subsequent logic

{# stage.sql #}

{% if source_model is mapping and source_model is not none -%}

    {%- set source_name = source_model | first -%}
    {%- set source_table_name = source_model[source_name] -%}

    {%- set source_relation = source(source_name, source_table_name) -%}
    {%- set all_source_columns = automate_dv.source_columns(source_relation=source_relation) -%}

suggested change

{# stage.sql #}

{% if source_model is mapping and source_model is not none -%}

    {%- set source_name = source_model | first -%}
    {%- set source_table_name = source_model[source_name] -%}

    {%- set relation = source(source_name, source_table_name) -%}
    {% set source_relation = api.Relation.create(database=relation.database, schema=relation.schema, identifier=relation.identifier) %}
    {%- set all_source_columns = automate_dv.source_columns(source_relation=source_relation) -%}

...

WITH source_data AS (

    SELECT

    {{- "\n\n    " ~ automate_dv.print_list(list_to_print=all_source_columns, columns_to_escape=columns_to_escape) if all_source_columns else " *" }}

    FROM {{ relation }}
    {%- set last_cte = "source_data" %}
)

AB#5620

DVAlexHiggs commented 1 month ago

Hi.

This is likely related to the fact that AutomateDV does not yet support SQL server for dbt 1.8, as stated in our latest release notes and our dbt Compatibility page.

It is to be expected that there will be unpredictable behaviour due to this.

I will leave this issue open until we have upgraded to the latest SQL Server dbt adapter in the next AutomateDV release. Thanks for your patience.