Snowflake-Labs / sfquickstarts

Follow along with our tutorials to get you up and running with the Snowflake Data Cloud.
Apache License 2.0

Leverage dbt Cloud to Generate ML ready pipelines using Snowpark python - code fails at step 8 in dbt #1135

Open beata9876 opened 5 months ago

beata9876 commented 5 months ago

I'm trying to complete step 8, but it fails inside dbt at the node "ml_data_prep":

[Screenshot of the failure in dbt]

The beginning of the error:

    $$ CALL ml_data_prep__dbt_sp();
    20:46:38 Opening a new connection, currently in state closed
    20:47:08 Snowflake adapter: Snowflake query id: 01b31f1e-0000-ad79-0000-00008f71a371
    20:47:08 Snowflake adapter: Snowflake error: 100357 (P0000): Python Interpreter Error:
    Traceback (most recent call last):
      File "_udf_code.py", line 151, in main
      File "_udf_code.py", line 36, in model
      File "/usr/lib/python_udf/f4e509cd372a09942abc6ef933b2ac5b1a7b1eb2a9f369fd8e4df343ffe35c5c/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 2263, in sum
        result = self._agg_general(

And the end of the log file:

[Screenshot of the end of the log file]

sfc-gh-cweidmann commented 5 months ago

There is a quick fix to this:

In models/ml/prep_encoding_splitting/ml_data_prep.py, replace line 5, dbt.config(packages=["pandas"]), with dbt.config(packages=["pandas==1.5.3"]).

Do the same in models/ml/prep_encoding_splitting/covariate_encoding.py: replace line 8, dbt.config(packages=["pandas","numpy","scikit-learn"]), with dbt.config(packages=["pandas==1.5.3","numpy","scikit-learn"]).

After saving these two changes, 'dbt build' should run successfully and everything should work as expected.

We are working with dbt to fix this Quickstart guide.
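
For anyone who wants to see the change in context, here is a minimal sketch of how the pinned package list might sit inside a dbt Python model. It is not the Quickstart's actual file: "upstream_model" is a hypothetical model name, and FakeDbt is a stand-in I made up so the snippet can be run outside dbt. Only the dbt.config(packages=[...]) line comes from the fix above.

```python
# Sketch only: shows where the pinned pandas version goes in a dbt Python model.
# "upstream_model" and FakeDbt are hypothetical and not part of the Quickstart.


def model(dbt, session):
    # Pin pandas so the Snowflake environment does not resolve a newer release.
    dbt.config(packages=["pandas==1.5.3"])
    # Inside dbt on Snowflake, dbt.ref() returns a Snowpark DataFrame.
    return dbt.ref("upstream_model")


class FakeDbt:
    """Stand-in for the object dbt passes to model(); exists only for this demo."""

    def __init__(self):
        self.settings = {}

    def config(self, **kwargs):
        # Record whatever the model configures so it can be inspected.
        self.settings.update(kwargs)

    def ref(self, name):
        # Return a placeholder instead of a real DataFrame.
        return f"<relation {name}>"


if __name__ == "__main__":
    fake = FakeDbt()
    model(fake, session=None)
    print(fake.settings["packages"])
```

Running the file directly only confirms which packages the model requests. The real model still has to be built through dbt.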

beata9876 commented 5 months ago

Success! Thanks a lot :)