Closed: kellyldougan closed this issue 1 year ago
Found this after a quick search. Might be a bug on the connector side (although I'm not sure which connector version is being used here in the stored procedure). Will dig in more when I get to it.
Good find, Chenyu!
It sounds like this could be related to a bug in snowflake-connector-python, which will be fixed in a release coming in Jan 2023. We can try again then and see if the issue persists.
Will close out once we confirm that Snowflake has fixed this on their end https://github.com/snowflakedb/snowflake-connector-python/issues/1034#issuecomment-1266583831
bumping this from another community report: https://getdbt.slack.com/archives/C03QUA7DWCW/p1673461860389249
@jtcohen6 looks like the underlying Snowflake issue is fixed; could we fix here as well?
+1. When, and in what version, can we expect this fix to get merged in?
It looks like the linked issue/fix on Snowflake's side (https://github.com/snowflakedb/snowflake-connector-python/pull/1274) will be included in a new version of snowflake-connector-python, either v2.9 released in December, or a new release coming later this month. We still need to test compatibility with that new version in dbt-snowflake. I'm going to transfer this issue over to the dbt-snowflake repo.
I'm getting the same error when running my project on dbt Cloud. It works perfectly when building from the IDE, but I get this error when I try running it in production.
The SPs from the build are left in my production schema, and I tried calling them manually from the Snowflake UI; that works, but only if I've first selected the correct database and schema. Maybe that is also a hint about what's going wrong?
Seeing this error as well since moving to custom schemas. Any workarounds?
Using dbt-snowflake: 1.3.0, Core: 1.3.2
Edit: Same error with the following versions: dbt-snowflake: 1.4.0, Core: 1.4.1
Issue still persists with dbt-snowflake 1.4.1. Any update on this issue? It was said here that this was to be fixed in 1.4.1.
The (hopeful) fix for this was included in snowflake-connector-python==3.0.0 just a few weeks ago (after the dbt Core v1.4 release): https://community.snowflake.com/s/article/Snowflake-Connector-for-Python-Release-Notes
Since it's a major version bump, there's some additional due diligence warranted before including the upgrade in a v1.4.x patch release.
Seems that the problem still exists in dbt-snowflake v1.4.1.
Hey all, we're lining this up for our next sprint. So hopefully we should have something out in the next 2 weeks.
I believe #476 resolves this issue. However, that just pushes into main. We'll need to backport this to released versions (e.g. 1.3.latest) before it shows up in dbt Cloud. Just calling that out here in case merging #476 closes this issue via automation.
Re-opening; this was closed due to automation. Once we've confirmed the changes in the backport we can close this.
@kellyldougan This appears to be working for me now. Can you please try your example again and let me know if your issue is resolved? Thanks!
I tried my example both in dbt Cloud and locally with dbt Core, and I am still seeing the same issue. With dbt Cloud, I ran using versions 1.3 and 1.4; is there a different version this should be working with?
@kellyldougan Apologies, I was mistaken. I thought we had already released the update to dbt Cloud, but we had not. However, that also means I haven't correctly replicated your scenario, since it works in dbt Cloud for me. Since I missed something, I'll lay out what I did and maybe we can compare to see the difference. Here's what I have in my project:
/macros
    generate_schema_name.sql
    generate_schema_name_for_env.sql
/models
    /datamarts
        /dim
            my_other_model.sql
            my_python_model.py
            schema.yml
.gitignore
README.md
dbt_project.yml
All other files are empty. .gitignore and README.md are irrelevant here. The rest of the files' contents are as follows:
generate_schema_name.sql
{% macro generate_schema_name(custom_schema_name, node) -%}
    {{ generate_schema_name_for_env(custom_schema_name, node) }}
{%- endmacro %}
generate_schema_name_for_env.sql
{% macro generate_schema_name_for_env(custom_schema_name, node) -%}
    {%- set default_schema = target.schema -%}
    {%- if target.name == "prod" and custom_schema_name is not none -%}
        {{ custom_schema_name | trim }}
    {%- else -%}
        {{ default_schema }}_{{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}
Note: I did update line 5 to be the equality operator (==) instead of the assignment operator (=).
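To make the macro's behavior concrete, here's the same resolution logic mirrored in plain Python (a sketch for illustration only; the target names and schemas are the ones used later in this comment):

def generate_schema_name_for_env(custom_schema_name, target_name, default_schema):
    # Mirrors the Jinja macro above: prod drops the default-schema prefix.
    if target_name == "prod" and custom_schema_name is not None:
        return custom_schema_name.strip()
    # Note: in the Jinja version, a None custom_schema_name renders as the
    # literal string "None" here, producing names like "DBT_MIKEALFARE_None"
    # (which matches what I describe at the end of this comment).
    return f"{default_schema}_{str(custom_schema_name).strip()}"

generate_schema_name_for_env("SNOWFLAKE_393", "dev", "DBT_MIKEALFARE")
# -> "DBT_MIKEALFARE_SNOWFLAKE_393"
generate_schema_name_for_env("SNOWFLAKE_393", "prod", "DBT_MIKEALFARE")
# -> "SNOWFLAKE_393"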
my_other_model.sql
select 1 as my_column
my_python_model.py
def model(dbt, session):
    dbt.config(materialized="table")
    df = dbt.ref("my_other_model").to_pandas()
    return df
schema.yml
version: 2

models:
  - name: my_other_model
    description: "A dummy model to be fed into a python model"
    columns:
      - name: my_column
        description: "The only column in the table, with one record whose value is the integer 1"
        tests:
          - not_null
  - name: my_python_model
    description: "The python model that copies the sql model"
    columns:
      - name: my_column
        description: "A copy of my_other_model.my_column"
        tests:
          - not_null
dbt_project.yml
name: 'dbt_snowflake_393'
version: '1.3.0'
config-version: 2
profile: 'default'
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
target-path: "target"
clean-targets:
  - "target"
  - "dbt_packages"

models:
  dbt_snowflake_393:
    datamarts:
      dim:
        schema: SNOWFLAKE_393
I have two environments:

Development
    Name: Development
    Environment Type: Development
    dbt Version: 1.3
    Default to a custom branch: False

Production
    Name: Production
    Environment Type: Deployment
    dbt Version: 1.3
    Only run on a custom branch: main
    ...
    Schema: DBT_MIKEALFARE

I have one job:

Deploy to Production
    Job Name: Deploy to Production
    Environment: Production
    dbt Version: Inherited from Production (1.3)
    Target Name: prod
    Threads: 4
    Environment Variables: empty
    ...
    Commands: dbt build; dbt run;
If I run in the console, I get the following relations, each with a single record {MY_COLUMN: 1}:

DBT_MIKEALFARE_SNOWFLAKE_393
    Tables
        MY_PYTHON_MODEL
    Views
        MY_OTHER_MODEL

If I run the job above, Deploy to Production, I get the following relations, each with a single record {MY_COLUMN: 1}:

SNOWFLAKE_393
    Tables
        MY_PYTHON_MODEL
    Views
        MY_OTHER_MODEL
What I did notice while building this is that I would get a schema of DBT_MIKEALFARE_None if I didn't line up the path in dbt_project.yml (datamarts > dim) with the models directory structure (models/datamarts/dim). Is this also what you were seeing?
Hi, I'm using dbt-snowflake 1.3.1, dbt-core 1.4.1, and snowflake-connector-python 3.0.1, and I am still having this issue. Is there a new workaround that needs to be implemented?
Hi @mikealfare, this is still persisting for another customer that had reported this as well. They've shared the below:
"Updated to dbt 1.4.5. We still see the issue.
Regarding the original issue ("building some of our python models in production and noticed that we had trouble building the python models in custom schemas even if we specify this in the config.yml that the model reads from"): could you or the dbt dev team offer some suggestions on how to debug? Based on the log listed in 3) below, it seems it cannot use the "dw" schema we configured and still tries to use the 'PUBLIC' schema.
1) Problem we face: after moving to the updated dbt version 1.4.5, we still see it trying to work on schema 'PUBLIC'
2) How we wrote the config.yml file (did we write the config correctly?):

  - name: dim_trial_data
    description: dim_trial_data
    schema: dw
    config:
      materialized: table
      tags: ['python']
3) Log error:
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 snowflake.connector.errors.ProgrammingError: 003001 (42501): SQL access control error:
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 Insufficient privileges to operate on schema 'PUBLIC'
Full log:
dwh-worker-6f4968dbd7-dvrzc
*** Reading local file: /mnt/airflow/logs/dag_id=make_dw_dbt/run_id=scheduled__2023-03-20T01:55:00+00:00/task_id=sflk_make_dw.dim_trial_data.transform/attempt=3.log
[2023-03-20, 05:24:06 UTC] {taskinstance.py:1087} INFO - Dependencies all met for <TaskInstance: make_dw_dbt.sflk_make_dw.dim_trial_data.transform scheduled__2023-03-20T01:55:00+00:00 [queued]>
[2023-03-20, 05:24:06 UTC] {taskinstance.py:1087} INFO - Dependencies all met for <TaskInstance: make_dw_dbt.sflk_make_dw.dim_trial_data.transform scheduled__2023-03-20T01:55:00+00:00 [queued]>
[2023-03-20, 05:24:06 UTC] {taskinstance.py:1283} INFO -
--------------------------------------------------------------------------------
[2023-03-20, 05:24:06 UTC] {taskinstance.py:1284} INFO - Starting attempt 3 of 3
[2023-03-20, 05:24:06 UTC] {taskinstance.py:1285} INFO -
--------------------------------------------------------------------------------
[2023-03-20, 05:24:06 UTC] {taskinstance.py:1304} INFO - Executing <Task(DbtRunOperator): sflk_make_dw.dim_trial_data.transform> on 2023-03-20 01:55:00+00:00
[2023-03-20, 05:24:07 UTC] {standard_task_runner.py:55} INFO - Started process 1303 to run task
[2023-03-20, 05:24:07 UTC] {standard_task_runner.py:82} INFO - Running: ['airflow', 'tasks', 'run', 'make_dw_dbt', 'sflk_make_dw.dim_trial_data.transform', 'scheduled__2023-03-20T01:55:00+00:00', '--job-id', '13105604', '--raw', '--subdir', '/app/app/py/projects/dwh/dwh/dags/make_dw_dbt.py', '--cfg-path', '/tmp/tmpdb40n01l']
[2023-03-20, 05:24:07 UTC] {standard_task_runner.py:83} INFO - Job 13105604: Subtask sflk_make_dw.dim_trial_data.transform
[2023-03-20, 05:24:07 UTC] {warnings.py:109} WARNING - /app/app/py/projects/dwh/.venv/lib/python3.8/site-packages/airflow/settings.py:400: DeprecationWarning: The sql_engine_encoding option in [core] has been moved to the sql_engine_encoding option in [database] - the old setting has been used, but please update your config.
engine_args["encoding"] = conf.get("database", "SQL_ENGINE_ENCODING", fallback="utf-8")
[2023-03-20, 05:24:07 UTC] {warnings.py:109} WARNING - /usr/local/lib/python3.8/configparser.py:408: FutureWarning: section/key [core/sql_engine_encoding] has been deprecated, you should use[database/sql_engine_encoding] instead. Please update your `conf.get*` call to use the new name
rawval = parser.get(section, option, raw=True, fallback=rest)
[2023-03-20, 05:24:07 UTC] {task_command.py:389} INFO - Running <TaskInstance: make_dw_dbt.sflk_make_dw.dim_trial_data.transform scheduled__2023-03-20T01:55:00+00:00 [running]> on host dwh-worker-6f4968dbd7-dvrzc
[2023-03-20, 05:24:07 UTC] {taskinstance.py:1511} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=make_dw_dbt
AIRFLOW_CTX_TASK_ID=sflk_make_dw.dim_trial_data.transform
AIRFLOW_CTX_EXECUTION_DATE=2023-03-20T01:55:00+00:00
AIRFLOW_CTX_TRY_NUMBER=3
AIRFLOW_CTX_DAG_RUN_ID=scheduled__2023-03-20T01:55:00+00:00
[2023-03-20, 05:24:07 UTC] {dbt_hook.py:117} INFO - dbt run --profiles-dir /root/.dbt --target prod --select dim_trial_data
[2023-03-20, 05:24:07 UTC] {dbt_hook.py:126} INFO - Output:
[2023-03-20, 05:24:11 UTC] {dbt_hook.py:130} INFO - 05:24:11 Running with dbt=1.4.5
[2023-03-20, 05:24:21 UTC] {dbt_hook.py:130} INFO - 05:24:21 Found 990 models, 0 tests, 0 snapshots, 0 analyses, 316 macros, 0 operations, 0 seed files, 21525 sources, 0 exposures, 0 metrics
[2023-03-20, 05:24:22 UTC] {dbt_hook.py:130} INFO - 05:24:22
[2023-03-20, 05:24:54 UTC] {dbt_hook.py:130} INFO - 05:24:54 Concurrency: 8 threads (target='prod')
[2023-03-20, 05:24:54 UTC] {dbt_hook.py:130} INFO - 05:24:54
[2023-03-20, 05:24:54 UTC] {dbt_hook.py:130} INFO - 05:24:54 1 of 1 START python table model public.dim_trial_data .......................... [RUN]
[2023-03-20, 05:25:25 UTC] {dbt_hook.py:130} INFO - 05:25:25 1 of 1 ERROR creating python table model public.dim_trial_data ................. [ERROR in 30.24s]
[2023-03-20, 05:25:25 UTC] {dbt_hook.py:130} INFO - 05:25:25
[2023-03-20, 05:25:25 UTC] {dbt_hook.py:130} INFO - 05:25:25 Finished running 1 table model in 0 hours 1 minutes and 3.26 seconds (63.26s).
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 Completed with 1 error and 0 warnings:
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 Database Error in model dim_trial_data (dbt/models/axi_dag/python/dim_trial_data.py)
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 100357 (P0000): Python Interpreter Error:
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 Traceback (most recent call last):
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 File "_udf_code.py", line 116, in main
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 File "_udf_code.py", line 39, in model
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 File "/usr/lib/python_udf/e52887d7c1321f13bb7e0686e6c3e053aa8fff44f5958e2453b3993c3d297747/lib/python3.8/site-packages/snowflake/snowpark/session.py", line 1386, in create_dataframe
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 t = self.write_pandas(
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 File "/usr/lib/python_udf/e52887d7c1321f13bb7e0686e6c3e053aa8fff44f5958e2453b3993c3d297747/lib/python3.8/site-packages/snowflake/snowpark/session.py", line 1300, in write_pandas
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 raise pe
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 File "/usr/lib/python_udf/e52887d7c1321f13bb7e0686e6c3e053aa8fff44f5958e2453b3993c3d297747/lib/python3.8/site-packages/snowflake/snowpark/session.py", line 1279, in write_pandas
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 success, nchunks, nrows, ci_output = write_pandas(
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 File "/usr/lib/python_udf/e52887d7c1321f13bb7e0686e6c3e053aa8fff44f5958e2453b3993c3d297747/lib/python3.8/site-packages/snowflake/connector/pandas_tools.py", line 223, in write_pandas
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 cursor.execute(file_format_sql, _is_internal=True)
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 File "/usr/lib/python_udf/e52887d7c1321f13bb7e0686e6c3e053aa8fff44f5958e2453b3993c3d297747/lib/python3.8/site-packages/snowflake/connector/cursor.py", line 828, in execute
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 Error.errorhandler_wrapper(
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 File "/usr/lib/python_udf/e52887d7c1321f13bb7e0686e6c3e053aa8fff44f5958e2453b3993c3d297747/lib/python3.8/site-packages/snowflake/connector/errors.py", line 229, in errorhandler_wrapper
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 handed_over = Error.hand_to_other_handler(
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 File "/usr/lib/python_udf/e52887d7c1321f13bb7e0686e6c3e053aa8fff44f5958e2453b3993c3d297747/lib/python3.8/site-packages/snowflake/connector/errors.py", line 284, in hand_to_other_handler
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 cursor.errorhandler(connection, cursor, error_class, error_value)
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 File "/usr/lib/python_udf/e52887d7c1321f13bb7e0686e6c3e053aa8fff44f5958e2453b3993c3d297747/lib/python3.8/site-packages/snowflake/connector/errors.py", line 163, in default_errorhandler
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 raise error_class(
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 snowflake.connector.errors.ProgrammingError: 003001 (42501): SQL access control error:
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 Insufficient privileges to operate on schema 'PUBLIC'
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 in function DIM_TRIAL_DATA__DBT_SP with handler main
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 compiled Code at dbt/target/run/dwh/dbt/models/axi_dag/python/dim_trial_data.py
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:130} INFO - 05:25:27 Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
[2023-03-20, 05:25:27 UTC] {dbt_hook.py:132} INFO - Command exited with return code 1
[2023-03-20, 05:25:27 UTC] {taskinstance.py:1772} ERROR - Task failed with exception
Traceback (most recent call last):
File "/app/app/py/projects/dwh/.venv/lib/python3.8/site-packages/airflow_dbt/operators/dbt_operator.py", line 98, in execute
self.create_hook().run_cli('run')
File "/app/app/py/projects/dwh/.venv/lib/python3.8/site-packages/airflow_dbt/hooks/dbt_hook.py", line 138, in run_cli
raise AirflowException("dbt command failed")
airflow.exceptions.AirflowException: dbt command failed
[2023-03-20, 05:25:27 UTC] {taskinstance.py:1322} INFO - Marking task as FAILED. dag_id=make_dw_dbt, task_id=sflk_make_dw.dim_trial_data.transform, execution_date=20230320T015500, start_date=20230320T052406, end_date=20230320T052527
[2023-03-20, 05:25:27 UTC] {standard_task_runner.py:100} ERROR - Failed to execute job 13105604 for task sflk_make_dw.dim_trial_data.transform (dbt command failed; 1303)
[2023-03-20, 05:25:27 UTC] {local_task_job.py:159} INFO - Task exited with return code 1
[2023-03-20, 05:25:28 UTC] {taskinstance.py:2582} INFO - 0 downstream tasks scheduled from follow-on schedule check
tl;dr: we override the developer schema. When the original un-overridden schema specified in the dbt profile is used, python model creation works exactly as expected.
We still experienced issues with this. We override the schema by adding a suffix for each developer, depending on the naming convention of the models being run. E.g. all models in the "prep" layer are suffixed with prep. My dbt profile schema is specified as "troyel", and normally my models go into "troyelmart" or "troyel__prep" etc.
When running a very simple python model inside this set-up, it failed unless I happened to have a schema in the database named "troyel". The profile-specified schema seems to be used in at least one of the processes leading up to a finished table. This was identified by running "select current_schema()" from dbt Cloud: it returned null when the base schema ("troyel") did not exist. I am unsure where to look in core / dbt-snowflake to find the perpetrating macro/logic here to fix it.
Here's an example breaking it: https://www.loom.com/share/a8771739cbfe401da1063be953f6e234
Here is the log: console_output.txt
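For anyone trying the same diagnosis, here is a minimal throwaway sketch of a python model that surfaces the session's current context (the model body is illustrative, not taken from the report above):

def model(dbt, session):
    # Materialize the session's current database/schema as a one-row table so
    # you can see where Snowpark will try to write. A null CUR_SCHEMA suggests
    # the profile's base schema doesn't exist, matching the behavior above.
    dbt.config(materialized="table")
    return session.sql(
        "select current_database() as cur_db, current_schema() as cur_schema"
    )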
I feel like there are multiple things going on here.
First, @Tonayya, @bama-troyel, the schema config should go in the config block:
  - name: dim_trial_data
    description: dim_trial_data
    config:
      materialized: table
      schema: dw
      tags: ['python']
Just covering the bases. An alternative is to put that in the Python model file itself:
def model(dbt, session):
    dbt.config(
        materialized="table",
        schema="dw"
    )
Then @ernestoongaro, what happens if you have access to the base schema as well as the overridden one, and you don't drop anything during the run? ;) Just checking that the base scenario - everything exists and you have access to it - does work, before going into the weeds.
Is there any update on this issue, or any workarounds? I am experiencing the same error as described in the opening and the last few most recent messages when running dbt run --models <my_python_model> on Python models:
snowflake.connector.errors.ProgrammingError: Schema has to be provided to write_pandas when a database is provided
Should I expect this to be fixed? I'm using the following library versions:
dbt-core 1.4.5
dbt-snowflake 1.4.2
snowflake-connector-python 3.0.0
Please let me know if further information would be helpful
I am also on:
dbt-core 1.4.5
dbt-snowflake 1.4.2
snowflake-connector-python 3.0.0
My custom schema is DEV_MODEL_python_model_test. The default schema is ADMIN.
My python model:
def model(dbt, session):
    """
    Entry point for dbt
    """
    dbt.config(materialized="table")
    df = dbt.ref("users")
    pandas_df = df.limit(100).to_pandas()
    return pandas_df
The error:
02:10:48 Database Error in model python_model (models/model/python_model_test/python_model.py)
02:10:48 100357 (P0000): Python Interpreter Error:
02:10:48 Traceback (most recent call last):
02:10:48 File "_udf_code.py", line 94, in main
02:10:48 File "_udf_code.py", line 88, in materialize
02:10:48 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/snowpark/session.py", line 1453, in create_dataframe
02:10:48 t = self.write_pandas(
02:10:48 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/snowpark/session.py", line 1367, in write_pandas
02:10:48 raise pe
02:10:48 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/snowpark/session.py", line 1346, in write_pandas
02:10:48 success, nchunks, nrows, ci_output = write_pandas(
02:10:48 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/connector/pandas_tools.py", line 223, in write_pandas
02:10:48 cursor.execute(file_format_sql, _is_internal=True)
02:10:48 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/connector/cursor.py", line 829, in execute
02:10:48 Error.errorhandler_wrapper(
02:10:48 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/connector/errors.py", line 232, in errorhandler_wrapper
02:10:48 handed_over = Error.hand_to_other_handler(
02:10:48 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/connector/errors.py", line 287, in hand_to_other_handler
02:10:48 cursor.errorhandler(connection, cursor, error_class, error_value)
02:10:48 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/connector/errors.py", line 165, in default_errorhandler
02:10:48 raise error_class(
02:10:48 snowflake.connector.errors.ProgrammingError: 003001 (42501): SQL access control error:
02:10:48 Insufficient privileges to operate on schema 'ADMIN'
02:10:48 in function PYTHON_MODEL__DBT_SP with handler main
02:10:48 compiled Code at target/run/consumer_dbt/models/model/python_model_test/python_model.py
02:10:48
02:10:48 Finished running 1 table model in 0 hours 0 minutes and 24.04 seconds (24.04s).
02:10:48 Encountered an error:
FailFast Error in model python_model (models/model/python_model_test/python_model.py)
Failing early due to test failure or runtime error
Error: Process completed with exit code 2.
I've tried converting the pandas_df to a Snowpark df by running write_pandas myself, e.g.:
def model(dbt, session):
    """
    Entry point for dbt
    """
    dbt.config(materialized="table")
    df = dbt.ref("participant_characteristics_wide")
    pandas_df = df.limit(100).to_pandas()
    new_df = session.write_pandas(
        df=pandas_df,
        table_name=f'{dbt.this.identifier}_temp',
        database=dbt.this.database,
        schema=dbt.this.schema,
        auto_create_table=True,
        table_type='temp',
        overwrite=True
    )
    return new_df
But then I get a new error which seems to have a syntax issue in the schema name (extra quotation marks):
02:51:40 Database Error in model python_model (models/model/python_model_test/python_model.py)
02:51:40 100357 (P0000): Python Interpreter Error:
02:51:40 Traceback (most recent call last):
02:51:40 File "_udf_code.py", line 102, in main
02:51:40 File "_udf_code.py", line 16, in model
02:51:40 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/snowpark/session.py", line 1367, in write_pandas
02:51:40 raise pe
02:51:40 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/snowpark/session.py", line 1346, in write_pandas
02:51:40 success, nchunks, nrows, ci_output = write_pandas(
02:51:40 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/connector/pandas_tools.py", line 202, in write_pandas
02:51:40 cursor.execute(drop_table_sql, _is_internal=True)
02:51:40 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/connector/cursor.py", line 829, in execute
02:51:40 Error.errorhandler_wrapper(
02:51:40 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/connector/errors.py", line 232, in errorhandler_wrapper
02:51:40 handed_over = Error.hand_to_other_handler(
02:51:40 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/connector/errors.py", line 287, in hand_to_other_handler
02:51:40 cursor.errorhandler(connection, cursor, error_class, error_value)
02:51:40 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/connector/errors.py", line 165, in default_errorhandler
02:51:40 raise error_class(
02:51:40 snowflake.connector.errors.ProgrammingError: 002003 (02000): SQL compilation error:
02:51:40 Schema 'STAGE."DEV_MODEL_python_model_test"' does not exist or not authorized.
02:51:40 in function PYTHON_MODEL__DBT_SP with handler main
02:51:40 compiled Code at target/run/consumer_dbt/models/model/python_model_test/python_model.py
02:51:40
02:51:40 Finished running 1 table model in 0 hours 0 minutes and 23.87 seconds (23.87s).
02:51:40 Encountered an error:
FailFast Error in model python_model (models/model/python_model_test/python_model.py)
Failing early due to test failure or runtime error
Error: Process completed with exit code 2.
Oddly, when I try setting quote_identifiers = False in the write_pandas function, I get the first error (the default schema) again.
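One plausible explanation for the quoting behavior, sketched below: Snowflake resolves unquoted identifiers case-insensitively (by uppercasing them) but quoted identifiers case-sensitively, so a mixed-case name like DEV_MODEL_python_model_test only matches when it is not wrapped in quotes, assuming the schema was created unquoted. The session calls here are illustrative, not from the thread:

# Assuming an open Snowpark session.
# Created unquoted, Snowflake stores the name uppercased:
session.sql("create schema if not exists DEV_MODEL_python_model_test").collect()

# Unquoted reference: uppercased before lookup, so it resolves:
session.sql("use schema DEV_MODEL_python_model_test").collect()

# Quoted reference: case-sensitive lookup of the mixed-case spelling fails,
# mirroring the 'Schema ... does not exist or not authorized' error above:
session.sql('use schema "DEV_MODEL_python_model_test"').collect()  # raises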
Hey @satpai1215 and @eaphilli,
To cover the basics: with your default schema ADMIN and that custom schema, the resulting schema name would be ADMIN_DEV_MODEL_python_model_test, not DEV_MODEL_python_model_test. I'm wondering if in your case the resulting schema doesn't exist and you don't have the privilege to create the custom schema, resulting in weird behavior.
Starting from a new project (dbt init), configure a schema in your profiles. Then add the following files in your project:
In /models, my_python_model.py:
def model(dbt, session):
    dbt.config(
        materialized="table",
        schema="customSchemaSuffix"
    )
    df = dbt.ref("my_first_dbt_model")
    return df
In /models, my_python_model.yml:
version: 2

models:
  - name: my_python_model
    description: my_python_model
    config:
      materialized: table
      schema: customSchemaSuffix
      tags: ['python']
You can define the custom schema property in either the property file or the model file. Both should work.
Do make sure that all of the schemas involved exist and you have access to them (or you have enough privilege to have dbt create them on the fly).
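A quick way to check, as a sketch (using information_schema, which any role can query; the model itself is a throwaway placeholder):

def model(dbt, session):
    # Throwaway python model: list the schemas the current role can see in the
    # target database, to confirm the resolved schema actually exists.
    dbt.config(materialized="table")
    return session.sql(
        "select schema_name from information_schema.schemata order by schema_name"
    )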
Thank you for your response here!
I can confirm that your base case scenario is working on my end. I've isolated the issue to problems occurring when returning a newly created dataframe from the model function (as opposed to one derived from dbt.ref()).
Here is my test case below, which returns the error that follows. Is there something I should be doing differently here? (Apologies if this is not the correct forum for these types of inquiries.)
import pandas as pd

def model(dbt, session):
    dbt.config(
        materialized="table",
        schema="MY_CUSTOM_SCHEMA",
        alias="_TEST_PYTHON_MODEL",
    )
    df = pd.DataFrame({
        'City': ['Buenos Aires', 'Brasilia', 'Santiago', 'Bogota', 'Caracas'],
        'Country': ['Argentina', 'Brazil', 'Chile', 'Colombia', 'Venezuela'],
        'Latitude': [-34.58, -15.78, -33.45, 4.60, 10.48],
        'Longitude': [-58.66, -47.91, -70.66, -74.08, -66.86]
    })
    return df
Error Trace
15:53:50 Database Error in model TEST_PYTHON_MODEL (models/python/TEST_PYTHON_MODEL.py)
15:53:50 100357 (P0000): Python Interpreter Error:
15:53:50 Traceback (most recent call last):
15:53:50 File "_udf_code.py", line 108, in main
15:53:50 File "_udf_code.py", line 102, in materialize
15:53:50 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/snowpark/session.py", line 1453, in create_dataframe
15:53:50 t = self.write_pandas(
15:53:50 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/snowpark/session.py", line 1367, in write_pandas
15:53:50 raise pe
15:53:50 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/snowpark/session.py", line 1346, in write_pandas
15:53:50 success, nchunks, nrows, ci_output = write_pandas(
15:53:50 File "/usr/lib/python_udf/694df3cf538f091a277e3df26e69af371a3630fe8656f8ad7d386069b297bf34/lib/python3.8/site-packages/snowflake/connector/pandas_tools.py", line 124, in write_pandas
15:53:50 raise ProgrammingError(
15:53:50 snowflake.connector.errors.ProgrammingError: Schema has to be provided to write_pandas when a database is provided
15:53:50 in function _TEST_PYTHON_MODEL__DBT_SP with handler main
15:53:50 compiled Code at /models/python/TEST_PYTHON_MODEL.py
15:53:50
15:53:50 Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
@patkearns10 and I had a chance to live-debug this with a user who's been running into the issue, and I think we got to the bottom of it! I'm still not sure why this is cropping up for only some users, and not others (me!) — and I'm pretty sure the root-cause fix here was (or should have been) in https://github.com/snowflakedb/snowflake-connector-python/pull/1274 — but this managed to reliably solve the problem.
The quickest fix was to add two lines within the Python model code, right before the return of the Pandas dataframe:
def model(dbt, session):
    dbt.config(schema="custom_schema")
    pandas_df = dbt.ref("my_model").to_pandas()

    # add these lines
    session.use_database(dbt.this.database)
    session.use_schema(dbt.this.schema)

    return pandas_df
I believe we could just fix this for everyone by updating these lines to include those session.use_* calls as well:
https://github.com/dbt-labs/dbt-snowflake/blob/0f77a0886f3308aea0d8313ff2fa82e7378bb8ce/dbt/include/snowflake/macros/materializations/table.sql#L49-L51
Namely, by changing that to:
if isinstance(df, pandas.core.frame.DataFrame):
    session.use_database(target_relation.database)
    session.use_schema(target_relation.schema)
    # session.write_pandas does not have overwrite function
    df = session.createDataFrame(df)
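For context on why pinning the session helps, here is a paraphrase (not a verbatim copy) of the guard inside snowflake-connector-python's pandas_tools.write_pandas that produces the first error reported in this thread:

# Sketch of the validation write_pandas performs on its location arguments.
# When the call arrives with a database but no schema, and the session context
# doesn't fill the gap, this is the error users have been seeing.
def check_location(database, schema):
    if database is not None and schema is None:
        raise ValueError(  # the real connector raises ProgrammingError
            "Schema has to be provided to write_pandas when a database is provided"
        )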
Tracking this for our next sprint
This issue also came up in community Slack: https://getdbt.slack.com/archives/CBSQTAPLG/p1683920034037219
I'm soon going to merge the fix Jeremy provided, but admittedly, as alluded to above, it's hard to recreate this on our system. Even after this issue closes and the fix has been backported, I heartily encourage users who continue to find custom schema problems to speak up!
I have just run into this issue this week when switching our Python models to custom schemas. Is there an ETA or a release version I can watch for the fix? Thank you.
Hey @waitsfornone, this should be out right now in 1.3.2, 1.4.3 and 1.5.1. Could you check it's fixed on your end and let me know?
Is this a new bug in dbt-core?
Current Behavior

When I try to create a python model using v1.3, I get an error that a schema was not provided. I have set a schema in my dbt_project.yml file for the directory that the model is in, and have the generate_schema_name macro set to change the schema based on the environment.

Expected Behavior

The python model will run and create a model in the same schema as sql models in the same directory.

Steps To Reproduce

1. Configure dbt_project.yml to use a given schema
2. Create a python model in that directory, e.g. dbt/models/datamarts/dim/test_python.py
3. Set up generate_schema_name_for_env and generate_schema_name:

{% macro generate_schema_name(custom_schema_name, node) -%}
    {{ generate_schema_name_for_env(custom_schema_name, node) }}
{%- endmacro %}
Environment
Which database adapter are you using with dbt? snowflake
Additional Context
No response