Open gwenwindflower opened 6 months ago
Thanks for opening this @gwenwindflower !
Which adapter did you use? Could you provide a simple dbt python model that exhibits this issue?
Was it dbt-snowflake with a model like this, by any chance?
import pandas as pd
import numpy as np
import re

def model(dbt, session):
    dbt.config(packages=["pandas", "numpy", "re"])
    df = pd.DataFrame({"hello": ["world"]})
    return df
And an error like this?
00:23:57 Database Error in model my_python_model (models/my_python_model.py)
100357 (P0000): Cannot create a Python function with the specified packages. Please check your packages specification and try again.
compiled Code at target/run/my_project/models/my_python_model.py
Hey @dbeatty10, sorry for the lack of a firsthand repro. I reported this on behalf of a user in the Community, so I didn't hit the error myself! @aranke suggested it could be worthwhile to just fix this rather than update the docs, and I tend to agree, particularly with the proposed idea of a clear warning instead of a mysterious error. Based on my conversation with the Community member, this looks like exactly the simplified version of the model he was creating and the error that confused him. Here's a link to the thread.
@aranke could you share the details of your proposed approach for this scenario?
If you can provide links to the relevant area(s) of the source code, that would be even better.
Code: TK
Python built-in modules: https://docs.python.org/3/library/sys.html#sys.builtin_module_names
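One possible shape for the filtering step, as a rough sketch only and not dbt's actual code: `split_stdlib_packages` is a hypothetical helper name, and where it would hook into the adapter is left open. Note that on CPython `sys.builtin_module_names` lists only modules compiled into the interpreter, so pure-Python modules such as `re` and `os` are not in it; `sys.stdlib_module_names` (Python 3.10+) covers those as well, so the sketch prefers it when available.

```python
import sys
import warnings


def split_stdlib_packages(packages):
    """Separate standard-library names from packages that need installing.

    Hypothetical helper for illustration; not part of dbt.
    """
    # sys.stdlib_module_names exists on Python 3.10+. On older interpreters
    # fall back to the smaller set of modules compiled into the interpreter.
    stdlib = getattr(
        sys, "stdlib_module_names", frozenset(sys.builtin_module_names)
    )

    third_party, ignored = [], []
    for spec in packages:
        # Drop a version pin such as "numpy==1.26" to get the bare name.
        name = spec.split("==")[0].strip()
        (ignored if name in stdlib else third_party).append(spec)

    if ignored:
        # Warn and proceed instead of failing with a database error.
        warnings.warn(
            f"Ignoring standard-library modules in `packages`: {ignored}. "
            "Import them directly in the model instead."
        )
    return third_party, ignored
```

With Python 3.10 or later, calling this on `["pandas", "numpy", "re"]` would pass only `pandas` and `numpy` on to the warehouse and emit a warning about `re`.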
Is this your first time submitting a feature request?
Describe the feature
Right now, if a user wants to use `re`, `os`, etc. in a Python model, they would rightfully think it important to add it to the `packages` list config argument of the model. In fact, dbt will throw a 'package not found' error for packages that aren't 3rd party. The Right Way at present is to just import and use them, but we don't flag that anywhere in the docs. It would be good to filter out the standard library packages and perhaps throw a warning instead of an error here, letting people know this isn't necessary, but still proceeding. At present you need to do this, which is not super obvious:
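A minimal sketch of that pattern, assuming dbt-snowflake (where `session` is a Snowpark session): the standard-library module is imported at the top of the file and simply left out of `packages`. The column name and sample string are illustrative only.

```python
import re  # standard library: imported directly, never listed in `packages`


def model(dbt, session):
    # Only third-party packages belong in `packages`; none are needed here.
    dbt.config(materialized="table")

    # Use the standard-library module like any other import.
    greeting = re.sub(r"\s+", " ", "hello    world")

    # Assumption: `session` is a Snowpark Session, as with dbt-snowflake.
    return session.create_dataframe([[greeting]], schema=["greeting"])
```

If the model also needed third-party libraries, those alone would go in the list, for example `packages=["pandas", "numpy"]`.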
Describe alternatives you've considered
Who will this benefit?
Users of Python models.
Are you interested in contributing this feature?
No
Anything else?