[X] I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
When running dbt run --empty with a snowpark model that includes dbt.ref, it falls over.
Here is a minimal snowpark model:
import pandas as pd
def model(dbt, session):
data = dbt.ref("stg_ftv__tvt_arena")
return pd.DataFrame({"x":[1,2,3], "y":['a','b','c']})
and dbt run --empty fails with
16:08:50 Database Error in model pymodel (models/pymodel.py)
100357 (P0000): Python Interpreter Error:
Traceback (most recent call last):
File "_udf_code.py", line 97, in main
File "_udf_code.py", line 12, in model
File "_udf_code.py", line 61, in <lambda>
File "_udf_code.py", line 28, in ref
File "/usr/lib/python_udf/8c636e84b1538a317283a9cebb185cc59b812b3e7c12390bf024519dc9298cf3/lib/python3.8/site-packages/snowflake/snowpark/session.py", line 1857, in table
validate_object_name(name)
File "/usr/lib/python_udf/8c636e84b1538a317283a9cebb185cc59b812b3e7c12390bf024519dc9298cf3/lib/python3.8/site-packages/snowflake/snowpark/_internal/utils.py", line 218, in validate_object_name
raise SnowparkClientExceptionMessages.GENERAL_INVALID_OBJECT_NAME(name)
snowflake.snowpark.exceptions.SnowparkInvalidObjectNameException: (1500): The object name '(select * from DATASCIENCE_DB.SANDBOX_S.stg_ftv__tvt_arena where false limit 0)' is invalid.
in function PYMODEL__DBT_SP with handler main
compiled Code at target/run/dbt_ftv/models/pymodel.py
16:08:50
Expected Behavior
that it should run through all dbt models quickly without failing
Thanks for reporting this @canderson-luminatedata !
I got the same error as you with the following files and commands:
### Reprex
Files:
`models/some_sql_model.sql`
```sql
select 1 as id
```
`models/snowpark_model.py`
```python
import pandas as pd
def model(dbt, session):
data = dbt.ref("some_sql_model")
return pd.DataFrame({"x":[1,2,3], "y":['a','b','c']})
```
Commands:
```
dbt run --empty
```
Output:
```
(dbt_1.8) $ dbt run --empty
23:44:57 Running with dbt=1.8.3
23:44:58 Registered adapter: snowflake=1.8.3
23:44:58 Found 2 models, 444 macros
23:44:58
23:45:01 Concurrency: 10 threads (target='snowflake')
23:45:01
23:45:01 1 of 2 START sql view model dbt_dbeatty.some_model ............................. [RUN]
23:45:02 1 of 2 OK created sql view model dbt_dbeatty.some_model ........................ [SUCCESS 1 in 0.96s]
23:45:02 2 of 2 START python table model dbt_dbeatty.snowpark_model ..................... [RUN]
23:45:05 2 of 2 ERROR creating python table model dbt_dbeatty.snowpark_model ............ [ERROR in 3.31s]
23:45:05
23:45:05 Finished running 1 view model, 1 table model in 0 hours 0 minutes and 6.62 seconds (6.62s).
23:45:05
23:45:05 Completed with 1 error and 0 warnings:
23:45:05
23:45:05 Database Error in model snowpark_model (models/snowpark_model.py)
100357 (P0000): Python Interpreter Error:
Traceback (most recent call last):
File "_udf_code.py", line 95, in main
File "_udf_code.py", line 10, in model
File "_udf_code.py", line 59, in
File "_udf_code.py", line 26, in ref
File "/usr/lib/python_udf/8c636e84b1538a317283a9cebb185cc59b812b3e7c12390bf024519dc9298cf3/lib/python3.8/site-packages/snowflake/snowpark/session.py", line 1857, in table
validate_object_name(name)
File "/usr/lib/python_udf/8c636e84b1538a317283a9cebb185cc59b812b3e7c12390bf024519dc9298cf3/lib/python3.8/site-packages/snowflake/snowpark/_internal/utils.py", line 218, in validate_object_name
raise SnowparkClientExceptionMessages.GENERAL_INVALID_OBJECT_NAME(name)
snowflake.snowpark.exceptions.SnowparkInvalidObjectNameException: (1500): The object name '(select * from analytics_dev.dbt_dbeatty.some_model where false limit 0)' is invalid.
in function SNOWPARK_MODEL__DBT_SP with handler main
compiled Code at target/run/my_project/models/snowpark_model.py
23:45:05
23:45:05 Done. PASS=1 WARN=0 ERROR=1 SKIP=0 TOTAL=2
```
`target/run/my_project/models/snowpark_model.py`
```
import pandas as pd
def model(dbt, session):
data = dbt.ref("some_model")
return pd.DataFrame({"x":[1,2,3], "y":['a','b','c']})
# This part is user provided model code
# you will need to copy the next section to run the code
# COMMAND ----------
# this part is dbt logic for get ref work, do not modify
def ref(*args, **kwargs):
refs = {"some_model": "(select * from analytics_dev.dbt_dbeatty.some_model where false limit 0)"}
key = '.'.join(args)
version = kwargs.get("v") or kwargs.get("version")
if version:
key += f".v{version}"
dbt_load_df_function = kwargs.get("dbt_load_df_function")
return dbt_load_df_function(refs[key])
def source(*args, dbt_load_df_function):
sources = {}
key = '.'.join(args)
return dbt_load_df_function(sources[key])
config_dict = {}
class config:
def __init__(self, *args, **kwargs):
pass
@staticmethod
def get(key, default=None):
return config_dict.get(key, default)
class this:
"""dbt.this() or dbt.this.identifier"""
database = "analytics_dev"
schema = "dbt_dbeatty"
identifier = "snowpark_model"
def __repr__(self):
return 'analytics_dev.dbt_dbeatty.snowpark_model'
class dbtObj:
def __init__(self, load_df_function) -> None:
self.source = lambda *args: source(*args, dbt_load_df_function=load_df_function)
self.ref = lambda *args, **kwargs: ref(*args, **kwargs, dbt_load_df_function=load_df_function)
self.config = config
self.this = this()
self.is_incremental = False
# COMMAND ----------
# To run this in snowsight, you need to select entry point to be main
# And you may have to modify the return type to text to get the result back
# def main(session):
# dbt = dbtObj(session.table)
# df = model(dbt, session)
# return df.collect()
# to run this in local notebook, you need to create a session following examples https://github.com/Snowflake-Labs/sfguide-getting-started-snowpark-python
# then you can do the following to run model
# dbt = dbtObj(session.table)
# df = model(dbt, session)
def materialize(session, df, target_relation):
# make sure pandas exists
import importlib.util
package_name = 'pandas'
if importlib.util.find_spec(package_name):
import pandas
if isinstance(df, pandas.core.frame.DataFrame):
session.use_database(target_relation.database)
session.use_schema(target_relation.schema)
# session.write_pandas does not have overwrite function
df = session.createDataFrame(df)
df.write.mode("overwrite").save_as_table('analytics_dev.dbt_dbeatty.snowpark_model', table_type='transient')
def main(session):
dbt = dbtObj(session.table)
df = model(dbt, session)
materialize(session, df, dbt.this)
return "OK"
```
Is this a new bug in dbt-core?
Current Behavior
When running
dbt run --empty
with a snowpark model that includesdbt.ref
, it falls over.Here is a minimal snowpark model:
and
dbt run --empty
fails withExpected Behavior
that it should run through all dbt models quickly without failing
Steps To Reproduce
with
and minimal model
Relevant log output
It is complaining about the following autogenerated code:
Environment
Which database adapter are you using with dbt?
snowflake
Additional Context
No response