dbt-labs / spark-utils

Utility functions for dbt projects running on Spark
https://hub.getdbt.com/fishtown-analytics/spark_utils/latest/
Apache License 2.0

spark__get_relations_by_pattern doesn't seem to work as expected #36

Open VDFaller opened 10 months ago

VDFaller commented 10 months ago

spark_utils==0.3.0

Edit: I just realized the README says it doesn't have a shim for this, but it sure seems to.

Found this because I was trying to figure out this codegen issue

If I call

dbt run-operation generate_source --args '{schema_name: my_schema, database_name: my_database}' 

I don't get anything back.

But if I call

dbt run-operation generate_source --args '{"schema_name": "my_schema", "database_name": "my_database", "table_pattern": "*"}'

I do get results back, but they come from target.database instead of my_database (both have the same schema). I think that's because database=None.

I'm guessing I could do something similar to this to fix the table_pattern problem, since codegen defaults it to '%'.

Possible Workaround

I'm able to get it working if I add

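    {# codegen passes the SQL LIKE wildcard '%'; swap it for '*' #}
    {% set table_pattern = table_pattern|replace('%', '*') %}
    {# switch to the requested database before looking up relations #}
    {%- call statement('switch_database', fetch_result=False) %}
        USE CATALOG {{database}}
    {%- endcall -%}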

to the top of spark__get_relations_by_pattern

But I don't know what kind of impact that might truly have. It's only the path I'm coming from.
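
For illustration, here is a minimal Python sketch of what those two added lines do. It assumes the macro ends up issuing a `SHOW TABLE EXTENDED ... LIKE` statement, where Spark SQL treats `*` rather than `%` as the wildcard. The function names `like_to_spark_glob` and `build_statements` are hypothetical and are not part of spark_utils or codegen.

```python
from typing import List


def like_to_spark_glob(table_pattern: str) -> str:
    # Same effect as the Jinja filter replace('%', '*') in the workaround.
    return table_pattern.replace("%", "*")


def build_statements(database: str, schema: str, table_pattern: str) -> List[str]:
    # Hypothetical helper: lists the SQL the patched macro is assumed to run.
    # The SHOW TABLE EXTENDED statement is an assumption about the macro body.
    return [
        f"USE CATALOG {database}",
        f"SHOW TABLE EXTENDED IN {schema} LIKE '{like_to_spark_glob(table_pattern)}'",
    ]


if __name__ == "__main__":
    # codegen's default pattern '%' with the names used in this issue
    for sql in build_statements("my_database", "my_schema", "%"):
        print(sql)
```

One thing this makes visible: `USE CATALOG` changes the current catalog for the session, so anything that runs later in the same session would presumably also resolve against my_database.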

VDFaller commented 10 months ago

It's totally possible that this is half on codegen for defaulting table_pattern to '%', and half on spark_utils for the database handling.