dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0
9.25k stars 1.53k forks source link

When providing BOTH fixture and rows to unit test, rows silently ignored - instead throw an error #10357

Open megetron3 opened 4 days ago

megetron3 commented 4 days ago

Is this a new bug in dbt-core?

Current Behavior

When attempting to pass parameters to a fixture input so that the fixture can receive some rows input, the fixture seems to ignore the rows. Here is the usage:

unit_tests:
  - name: test_my_model
    model: my_model
    given:
      - input: ref('my_model_a')
        format: sql
        fixture: model_sql_file
        rows: [{param1: 1, param2: 2}]

Despite the rows being defined, the fixture does not recognize them and there are no error warnings.

Expected Behavior

The fixture should be able to receive the rows input and based on this input, the Jinja should generate the relevant input. The fixture's SQL model should start rendering with parameters.

Steps To Reproduce

Steps to Reproduce

  1. Define the unit_tests with the given parameters and rows as shown above.
  2. Run the test.
  3. Observe that the fixture ignores the rows input.

Relevant log output

No response

Environment

- OS:
- Python:
- dbt:

Which database adapter are you using with dbt?

No response

Additional Context

Workaround A temporary solution was found by modifying the compile_and_execute function in the dbt-core base.py file as follows:

with collect_timing_info("execute", ctx.timing.append):
    result = self.run(ctx.node, manifest, {"source_path": "sad"})
    ctx.node = result.node

This change resolves the issue and the fixture's SQL model starts rendering with parameters. However, a permanent solution in the form of a pull request has not been successful yet.

graciegoheen commented 3 days ago

Hi! When defining your input data using format: sql you can either use an inline SQL query for rows or provide the name of a fixture. We don't currently support you providing both at the same time - docs here. Sounds like we may want to throw an error in this case, so that this doesn't "fail silently"? Will update this issue accordingly.

Additionally, Jinja is currently unsupported in SQL fixtures for unit tests. Similar issue here.

megetron3 commented 2 days ago

Hi! When defining your input data using format: sql you can either use an inline SQL query for rows or provide the name of a fixture. We don't currently support you providing both at the same time - docs here. Sounds like we may want to throw an error in this case, so that this doesn't "fail silently"? Will update this issue accordingly.

Why should we limit this option when it has the potential to enhance the customizability of unit tests, as a testing framework should? Given that we already process a Jinja template with .sql (which is already supported by dbt unit-tests), it seems logical to also process the Jinja template with arguments. This could open up a new realm of flexibility. I believe this could be a significant shift towards customizable inputs, something the community has been seeking since the initial launch of 1.8.0.

Additionally, Jinja is currently unsupported in SQL fixtures for unit tests. Similar issue here.

I believe it would be advantageous to add support for Jinja. The ability to execute something like this would be highly beneficial:

given:
      - input: ref('stg_tpch_customers')
        format: sql
        rows: |
          select customer_key from {{ target.database ~ '.' ~ target.schema }}.stg_tpch_customers where is_testing_row = true

This would undoubtedly increase the flexibility of input.

However, please note that these are separate issues.

Just to ensure we're on the same page and I've understood correctly, there are two options:

Option 1: Using a SQL Jinja file, where jinja is already supported by dbt. I would ask to add support to receive arguments (through rows or params key).

Option 2: Using inline SQL (without a file), in which case we don't currently support Jinja - so adding such support might be more complex. But in my opinion, the first option should be relatively straightforward to implement.

However, please note that I attempted to create a pull request to implement option 1, but I have not been successful so far.