dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0

[CT-2672] [Bug] Cannot import name 'Unpack' from 'typing_extensions' (dbt-core 1.5.1) #7828

Closed Almaz-KG closed 10 months ago

Almaz-KG commented 1 year ago

Is this a new bug in dbt-core?

Current Behavior

I’m trying to migrate to dbt-core 1.5.1 and play with programmatic invocations

Expected Behavior

I can import and use dbtRunner and dbtRunnerResult from a Python environment.

Steps To Reproduce

  1. pip install -r requirements.txt (see the requirements.txt section below)
  2. from dbt.cli.main import dbtRunner, dbtRunnerResult
    # initialize
    dbt_runner = dbtRunner()
    cli_args = ["run", "--select", "tag:my_tag"]
    
    # run the command
    res: dbtRunnerResult = dbt_runner.invoke(cli_args)
    
    # inspect the results
    for r in res.result:
        print(f"{r.node.name}: {r.status}")

Relevant log output

ImportError                               Traceback (most recent call last)
<command-1597366681957068> in <cell line: 1>()
----> 1 from dbt.cli.main import dbtRunner, dbtRunnerResult
      2 
      3 def dbt(args):
      4     # initialize
      5     dbt_runner = dbtRunner()

/databricks/python_shell/dbruntime/PythonPackageImportsInstrumentation/__init__.py in import_patch(name, globals, locals, fromlist, level)
    169             # Import the desired module. If you’re seeing this while debugging a failed import,
    170             # look at preceding stack frames for relevant error information.
--> 171             original_result = python_builtin_import(name, globals, locals, fromlist, level)
    172 
    173             is_root_import = thread_local._nest_level == 1

/local_disk0/.ephemeral_nfs/envs/pythonEnv-8dd839ed-b379-4ba3-b2af-457367cb74b7/lib/python3.9/site-packages/dbt/cli/__init__.py in <module>
----> 1 from .main import cli as dbt_cli  # noqa

/databricks/python_shell/dbruntime/PythonPackageImportsInstrumentation/__init__.py in import_patch(name, globals, locals, fromlist, level)
    169             # Import the desired module. If you’re seeing this while debugging a failed import,
    170             # look at preceding stack frames for relevant error information.
--> 171             original_result = python_builtin_import(name, globals, locals, fromlist, level)
    172 
    173             is_root_import = thread_local._nest_level == 1

/local_disk0/.ephemeral_nfs/envs/pythonEnv-8dd839ed-b379-4ba3-b2af-457367cb74b7/lib/python3.9/site-packages/dbt/cli/main.py in <module>
     11 )
     12 
---> 13 from dbt.cli import requires, params as p
     14 from dbt.cli.exceptions import (
     15     DbtInternalException,

/databricks/python_shell/dbruntime/PythonPackageImportsInstrumentation/__init__.py in import_patch(name, globals, locals, fromlist, level)
    169             # Import the desired module. If you’re seeing this while debugging a failed import,
    170             # look at preceding stack frames for relevant error information.
--> 171             original_result = python_builtin_import(name, globals, locals, fromlist, level)
    172 
    173             is_root_import = thread_local._nest_level == 1

/local_disk0/.ephemeral_nfs/envs/pythonEnv-8dd839ed-b379-4ba3-b2af-457367cb74b7/lib/python3.9/site-packages/dbt/cli/requires.py in <module>
----> 1 import dbt.tracking
      2 from dbt.version import installed as installed_version
      3 from dbt.adapters.factory import adapter_management, register_adapter
      4 from dbt.flags import set_flags, get_flag_dict
      5 from dbt.cli.exceptions import (

/databricks/python_shell/dbruntime/PythonPackageImportsInstrumentation/__init__.py in import_patch(name, globals, locals, fromlist, level)
    169             # Import the desired module. If you’re seeing this while debugging a failed import,
    170             # look at preceding stack frames for relevant error information.
--> 171             original_result = python_builtin_import(name, globals, locals, fromlist, level)
    172 
    173             is_root_import = thread_local._nest_level == 1

/local_disk0/.ephemeral_nfs/envs/pythonEnv-8dd839ed-b379-4ba3-b2af-457367cb74b7/lib/python3.9/site-packages/dbt/tracking.py in <module>
     13 from snowplow_tracker import logger as sp_logger
     14 
---> 15 from dbt import version as dbt_version
     16 from dbt.clients.yaml_helper import safe_load, yaml  # noqa:F401
     17 from dbt.events.functions import fire_event, get_invocation_id

/databricks/python_shell/dbruntime/PythonPackageImportsInstrumentation/__init__.py in import_patch(name, globals, locals, fromlist, level)
    169             # Import the desired module. If you’re seeing this while debugging a failed import,
    170             # look at preceding stack frames for relevant error information.
--> 171             original_result = python_builtin_import(name, globals, locals, fromlist, level)
    172 
    173             is_root_import = thread_local._nest_level == 1

/local_disk0/.ephemeral_nfs/envs/pythonEnv-8dd839ed-b379-4ba3-b2af-457367cb74b7/lib/python3.9/site-packages/dbt/version.py in <module>
      8 import requests
      9 
---> 10 import dbt.exceptions
     11 import dbt.semver
     12 

/databricks/python_shell/dbruntime/PythonPackageImportsInstrumentation/__init__.py in import_patch(name, globals, locals, fromlist, level)
    169             # Import the desired module. If you’re seeing this while debugging a failed import,
    170             # look at preceding stack frames for relevant error information.
--> 171             original_result = python_builtin_import(name, globals, locals, fromlist, level)
    172 
    173             is_root_import = thread_local._nest_level == 1

/local_disk0/.ephemeral_nfs/envs/pythonEnv-8dd839ed-b379-4ba3-b2af-457367cb74b7/lib/python3.9/site-packages/dbt/exceptions.py in <module>
      6 from typing import Any, Dict, List, Mapping, Optional, Union
      7 
----> 8 from dbt.dataclass_schema import ValidationError
      9 from dbt.events.helpers import env_secrets, scrub_secrets
     10 from dbt.node_types import NodeType

/databricks/python_shell/dbruntime/PythonPackageImportsInstrumentation/__init__.py in import_patch(name, globals, locals, fromlist, level)
    169             # Import the desired module. If you’re seeing this while debugging a failed import,
    170             # look at preceding stack frames for relevant error information.
--> 171             original_result = python_builtin_import(name, globals, locals, fromlist, level)
    172 
    173             is_root_import = thread_local._nest_level == 1

/local_disk0/.ephemeral_nfs/envs/pythonEnv-8dd839ed-b379-4ba3-b2af-457367cb74b7/lib/python3.9/site-packages/dbt/dataclass_schema.py in <module>
     13 
     14 # type: ignore
---> 15 from mashumaro import DataClassDictMixin
     16 from mashumaro.config import TO_DICT_ADD_OMIT_NONE_FLAG, BaseConfig as MashBaseConfig
     17 from mashumaro.types import SerializableType, SerializationStrategy

/databricks/python_shell/dbruntime/PythonPackageImportsInstrumentation/__init__.py in import_patch(name, globals, locals, fromlist, level)
    169             # Import the desired module. If you’re seeing this while debugging a failed import,
    170             # look at preceding stack frames for relevant error information.
--> 171             original_result = python_builtin_import(name, globals, locals, fromlist, level)
    172 
    173             is_root_import = thread_local._nest_level == 1

/local_disk0/.ephemeral_nfs/envs/pythonEnv-8dd839ed-b379-4ba3-b2af-457367cb74b7/lib/python3.9/site-packages/mashumaro/__init__.py in <module>
----> 1 from mashumaro.exceptions import MissingField
      2 from mashumaro.helper import field_options, pass_through
      3 from mashumaro.mixins.dict import DataClassDictMixin
      4 
      5 __all__ = [

/databricks/python_shell/dbruntime/PythonPackageImportsInstrumentation/__init__.py in import_patch(name, globals, locals, fromlist, level)
    169             # Import the desired module. If you’re seeing this while debugging a failed import,
    170             # look at preceding stack frames for relevant error information.
--> 171             original_result = python_builtin_import(name, globals, locals, fromlist, level)
    172 
    173             is_root_import = thread_local._nest_level == 1

/local_disk0/.ephemeral_nfs/envs/pythonEnv-8dd839ed-b379-4ba3-b2af-457367cb74b7/lib/python3.9/site-packages/mashumaro/exceptions.py in <module>
      1 from typing import Any, Optional, Type
      2 
----> 3 from mashumaro.core.meta.helpers import type_name
      4 
      5 

/databricks/python_shell/dbruntime/PythonPackageImportsInstrumentation/__init__.py in import_patch(name, globals, locals, fromlist, level)
    169             # Import the desired module. If you’re seeing this while debugging a failed import,
    170             # look at preceding stack frames for relevant error information.
--> 171             original_result = python_builtin_import(name, globals, locals, fromlist, level)
    172 
    173             is_root_import = thread_local._nest_level == 1

/local_disk0/.ephemeral_nfs/envs/pythonEnv-8dd839ed-b379-4ba3-b2af-457367cb74b7/lib/python3.9/site-packages/mashumaro/core/meta/helpers.py in <module>
     23 
     24 import typing_extensions
---> 25 from typing_extensions import Unpack
     26 
     27 from mashumaro.core.const import (

ImportError: cannot import name 'Unpack' from 'typing_extensions' (/databricks/python/lib/python3.9/site-packages/typing_extensions.py)

Environment

- OS: Linux 5.15.0-1035-aws
- Python: 3.9.5
- dbt: dbt-core==1.5.1

Content of requirements.txt

dbt-core==1.5.1
dbt-databricks==1.5.2
dbt-spark==1.5.0
dbt-spark[PyHive]==1.5.0
sqlfluff==2.1.1
sqlfluff-templater-dbt==2.1.1
databricks-sql-cli==0.3.2
dryable==1.1.0
tqdm==4.64.0
elementary-data==0.8.0
elementary-data[databricks]==0.8.0
requests
types-protobuf==4.23.0.1 # added to play around it
typing_extensions==4.4.0 # the same result with 4.4.6

Which database adapter are you using with dbt?

spark

Additional Context

No response
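In the traceback above, dbt is loaded from the notebook environment under /local_disk0/.ephemeral_nfs/envs/..., while typing_extensions resolves to /databricks/python/lib/python3.9/site-packages/typing_extensions.py. A small diagnostic like the following may help confirm which copy of a module the interpreter actually imports and whether it exposes the names being asked for. This is only a sketch: `inspect_module` is a hypothetical helper name, and the version reported by `importlib.metadata` can differ from the copy that was imported if more than one copy is present.

```python
import importlib
from importlib import metadata
from typing import Dict, Iterable


def inspect_module(module_name: str, dist_name: str, names: Iterable[str]) -> Dict[str, object]:
    """Report where a module was loaded from and which of the given names it lacks."""
    module = importlib.import_module(module_name)
    try:
        # Version recorded in the installed distribution metadata, if any.
        installed_version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        installed_version = None
    return {
        "file": getattr(module, "__file__", None),
        "installed_version": installed_version,
        "missing": [name for name in names if not hasattr(module, name)],
    }


if __name__ == "__main__":
    try:
        # 'Unpack' and 'override' are the two names reported as missing in this thread.
        print(inspect_module("typing_extensions", "typing_extensions", ["Unpack", "override"]))
    except ImportError as exc:
        print(f"typing_extensions is not importable: {exc}")
```

If "file" points at the runtime's own site-packages rather than the environment where dbt-core was installed, the pinned version in requirements.txt is probably not the one being imported.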

dbeatty10 commented 1 year ago

That is awesome that you are trying out programmatic invocations @Almaz-KG 🤩 !

I didn't try out with the requirements.txt you listed yet. But with a fresh virtual environment with just dbt-core and dbt-postgres, the following worked for me.

Reprex

models/my_model.sql

{{ config(tags="my_tag") }}

select 1 as id

runner.py

from dbt.cli.main import dbtRunner, dbtRunnerResult

# initialize
dbt_runner = dbtRunner()
cli_args = ["run", "--select", "tag:my_tag"]

# run the command
res: dbtRunnerResult = dbt_runner.invoke(cli_args)

# inspect the results
for r in res.result:
    print(f"{r.node.name}: {r.status}")

This worked for me:

dbt run --select tag:my_tag

And so did this:

python runner.py

Output:

17:43:23  Running with dbt=1.5.0
17:43:23  Found 1 model, 0 tests, 0 snapshots, 0 analyses, 307 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics, 0 groups
17:43:23  
17:43:23  Concurrency: 5 threads (target='postgres')
17:43:23  
17:43:23  1 of 1 START sql view model dbt_dbeatty.my_model ............................... [RUN]
17:43:23  1 of 1 OK created sql view model dbt_dbeatty.my_model .......................... [CREATE VIEW in 0.13s]
17:43:23  
17:43:23  Finished running 1 view model in 0 hours 0 minutes and 0.35 seconds (0.35s).
17:43:23  
17:43:23  Completed successfully
17:43:23  
17:43:23  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
my_model: success

Fatal1ty commented 1 year ago

ImportError: cannot import name 'Unpack' from 'typing_extensions' (/databricks/python/lib/python3.9/site-packages/typing_extensions.py)

Oh, I've come across something like this after typing-extensions upgrade and made a fix in this commit:

It will be included in the upcoming release this or next week.

dbeatty10 commented 1 year ago

Thanks for that keen insight @Fatal1ty !

In the meantime, are there any specific versions of mashumaro that we should explicitly exclude in our setup.py? https://github.com/dbt-labs/dbt-core/blob/98d1a94b608adcf0c81b5dcda0f5e5362e0620a8/core/setup.py#L56

Fatal1ty commented 1 year ago

@dbeatty10

I think it's worth excluding only those versions that break something in dbt-core. Setting aside the transition to major version 3, I remember that 3.0.2 and 3.1.0 were problematic, so they are candidates for exclusion. On the other hand, if you rely on functionality that exists only from some version X onward, then you should pin X as the minimal version and remove all exclusions below X as redundant:

mashumaro[msgpack]>=X

If I were in your place, I would look at the pull requests from dependabot and either:

a) exclude the package versions that do not pass the tests, leaving the minimal version X, or
b) bump the minimal version X to the next patch version that passes the tests.

Option "a" is better for users because it makes it easier to use dbt-core in combination with other packages that may also depend on mashumaro. But if dbt-core is positioned as a product rather than as a library, then option "b", or even a strict pin to a specific version of a package, is also a good choice.
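To make the two options above concrete, here is a minimal sketch of how each one filters candidate releases. It is a simplified stand-in for pip's real specifier handling (release versions only, no pre-releases), and the minimum versions used are hypothetical placeholders for X. In requirement syntax, option "a" would correspond to something like `mashumaro[msgpack]>=X,!=3.0.2,!=3.1.0`.

```python
from typing import Iterable, Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Turn '3.6' or '3.0.2' into a zero-padded 3-tuple of ints."""
    parts = [int(piece) for piece in text.split(".")]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def is_allowed(candidate: str, minimum: str, excluded: Iterable[str] = ()) -> bool:
    """True if candidate is at or above the minimum and not explicitly excluded."""
    version = parse_version(candidate)
    if version < parse_version(minimum):
        return False
    return version not in {parse_version(item) for item in excluded}


# Option "a": keep a low minimum (hypothetical 3.0) and exclude the known-bad releases.
OPTION_A = {"minimum": "3.0", "excluded": ("3.0.2", "3.1.0")}
# Option "b": bump the minimum past the known-bad releases (hypothetical 3.1.1).
OPTION_B = {"minimum": "3.1.1", "excluded": ()}

if __name__ == "__main__":
    for candidate in ("3.0.1", "3.0.2", "3.1.0", "3.6"):
        print(candidate, is_allowed(candidate, **OPTION_A), is_allowed(candidate, **OPTION_B))
```

Option "a" still accepts older working releases such as 3.0.1, which is what keeps it friendlier to other packages that also depend on mashumaro; option "b" rejects everything below the new minimum.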

ghost commented 1 year ago

Hey @Fatal1ty ! WDYT, should I wait for the fixes in dbt-core or can I hack this dependency issue by myself? Maybe I can force pip to install a specific version of mashumaro in the requirements.txt file

Fatal1ty commented 1 year ago

Oh, I've come across something like this after typing-extensions upgrade and made a fix in this commit

Forget what I wrote before. What I encountered is unrelated to the current issue.

Has anyone else tried to reproduce this issue? Because I can't:

(test)  ~/projects/test > python --version
Python 3.9.5
(test)  ~/projects/test > pip freeze
about-time==3.1.1
agate==1.7.0
alembic==1.11.1
alive-progress==2.3.1
appdirs==1.4.4
attrs==23.1.0
Babel==2.12.1
backoff==2.2.1
beautifulsoup4==4.12.2
boto3==1.26.153
botocore==1.29.153
cachetools==5.3.1
certifi==2023.5.7
cffi==1.15.1
chardet==5.1.0
charset-normalizer==3.1.0
cli-helpers==2.3.0
click==8.1.3
colorama==0.4.6
configobj==5.0.8
cryptography==41.0.1
databricks-sdk==0.1.6
databricks-sql-cli==0.3.2
databricks-sql-connector==2.6.1
dbt-core==1.5.1
dbt-databricks==1.5.2
dbt-extractor==0.4.1
dbt-spark==1.5.0
diff-cover==7.6.0
dryable==1.1.0
elementary-data==0.8.0
et-xmlfile==1.1.0
exceptiongroup==1.1.1
future==0.18.3
google-api-core==2.11.0
google-auth==2.20.0
google-cloud-core==2.3.2
google-cloud-storage==2.9.0
google-crc32c==1.5.0
google-resumable-media==2.5.0
googleapis-common-protos==1.59.1
grapheme==0.6.0
greenlet==2.0.2
hologram==0.0.16
idna==3.4
importlib-metadata==6.6.0
iniconfig==2.0.0
isodate==0.6.1
jaraco.classes==3.2.3
jeepney==0.8.0
Jinja2==3.1.2
jinja2-simple-tags==0.5.0
jmespath==1.0.1
jsonschema==4.17.3
keyring==23.13.1
leather==0.3.4
Logbook==1.5.3
lz4==4.3.2
Mako==1.2.4
markdown-it-py==3.0.0
MarkupSafe==2.1.3
mashumaro==3.6
mdurl==0.1.2
minimal-snowplow-tracker==0.0.2
monotonic==1.6
more-itertools==9.1.0
msgpack==1.0.5
networkx==2.8.8
numpy==1.23.4
oauthlib==3.2.2
openpyxl==3.1.2
packaging==22.0
pandas==1.3.4
parsedatetime==2.4
pathspec==0.11.1
pluggy==1.0.0
posthog==2.5.0
prompt-toolkit==3.0.38
protobuf==4.23.2
pure-sasl==0.6.2
pyarrow==12.0.1
pyasn1==0.5.0
pyasn1-modules==0.3.0
pycparser==2.21
pydantic==1.10.9
pyfiglet==0.8.post1
Pygments==2.15.1
PyHive==0.6.5
pyrsistent==0.19.3
pytest==7.3.2
pytest-parametrization==2022.2.1
python-dateutil==2.8.2
python-slugify==8.0.1
pytimeparse==1.1.8
pytz==2023.3
PyYAML==6.0
ratelimit==2.2.1
regex==2023.6.3
requests==2.28.2
rich==13.4.2
rsa==4.9
ruamel.yaml==0.17.31
ruamel.yaml.clib==0.2.7
s3transfer==0.6.1
sasl==0.3.1
SecretStorage==3.3.3
six==1.16.0
slack-sdk==3.21.3
soupsieve==2.4.1
SQLAlchemy==1.4.48
sqlfluff==2.1.1
sqlfluff-templater-dbt==2.1.1
sqlparams==5.1.0
sqlparse==0.4.3
tabulate==0.9.0
tblib==1.7.0
text-unidecode==1.3
thrift==0.16.0
thrift-sasl==0.4.3
toml==0.10.2
tomli==2.0.1
tqdm==4.64.0
types-protobuf==4.23.0.1
typing-extensions==4.4.0
urllib3==1.26.16
wcwidth==0.2.6
Werkzeug==2.3.6
zipp==3.15.0
(test)  ~/projects/test > python test.py
Traceback (most recent call last):
  File "/home/my_user/projects/test/test.py", line 10, in <module>
    for r in res.result:
TypeError: 'NoneType' object is not iterable

@Almaz-KG It looks like you failed to do imports on the first line in your environment:

----> 1 from dbt.cli.main import dbtRunner, dbtRunnerResult

can I hack this dependency issue by myself?

I bet you have typing-extensions<4.1.0 installed.
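The `TypeError: 'NoneType' object is not iterable` in the test.py run above comes from iterating `res.result` when the invocation produced no result, presumably because the invocation itself did not succeed. A guarded version of the loop could look like the sketch below. It relies only on the `success`, `exception` and `result` attributes that dbtRunnerResult is documented to expose; `summarize` is a hypothetical helper, and the objects built in the demo are stand-ins, not real dbt results.

```python
from types import SimpleNamespace
from typing import List


def summarize(res) -> List[str]:
    """Return printable lines for a dbtRunnerResult-like object without assuming success."""
    if res.exception is not None:
        return [f"invocation raised: {res.exception!r}"]
    if res.result is None:
        return ["invocation returned no result"]
    return [f"{r.node.name}: {r.status}" for r in res.result]


if __name__ == "__main__":
    # Stand-in objects that mimic the attribute layout used in this thread.
    failed = SimpleNamespace(success=False, exception=RuntimeError("example failure"), result=None)
    node_result = SimpleNamespace(node=SimpleNamespace(name="my_model"), status="success")
    succeeded = SimpleNamespace(success=True, exception=None, result=[node_result])
    for res in (failed, succeeded):
        print("\n".join(summarize(res)))
```

With a real runner, the same function would be called as `summarize(dbt_runner.invoke(cli_args))`.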

ghost commented 1 year ago

I bet you have typing-extensions<4.1.0 installed.

No, I have version 4.6.3. The full list of deps with their versions (got from pip freeze) looks like this

about-time==3.1.1
agate==1.7.0
alembic==1.11.1
alive-progress==2.3.1
appdirs==1.4.4
argon2-cffi==20.1.0
async-generator==1.10
attrs==21.2.0
Babel==2.12.1
backcall==0.2.0
backoff==2.2.1
backports.entry-points-selectable==1.1.1
beautifulsoup4==4.12.2
black==22.3.0
bleach==4.0.0
boto3==1.21.18
botocore==1.24.18
cachetools==5.3.1
certifi==2021.10.8
cffi==1.14.6
chardet==4.0.0
charset-normalizer==2.0.4
cli-helpers==2.3.0
click==8.1.3
colorama==0.4.6
configobj==5.0.8
cryptography==3.4.8
cycler==0.10.0
Cython==0.29.24
databricks-sdk==0.1.6
databricks-sql-cli==0.3.2
databricks-sql-connector==2.6.1
dbt-core==1.5.1
dbt-databricks==1.5.4
dbt-extractor==0.4.1
dbt-spark==1.5.0
dbus-python==1.2.16
debugpy==1.4.1
decorator==5.1.0
defusedxml==0.7.1
diff-cover==7.6.0
distlib==0.3.6
distro==1.4.0
distro-info===0.23ubuntu1
dryable==1.1.0
elementary-data==0.8.0
entrypoints==0.3
et-xmlfile==1.1.0
exceptiongroup==1.1.1
facets-overview==1.0.0
filelock==3.8.0
future==0.18.3
google-api-core==2.11.0
google-auth==2.20.0
google-cloud-core==2.3.2
google-cloud-storage==2.9.0
google-crc32c==1.5.0
google-resumable-media==2.5.0
googleapis-common-protos==1.59.1
grapheme==0.6.0
greenlet==2.0.2
hologram==0.0.16
idna==3.2
importlib-metadata==6.6.0
iniconfig==2.0.0
ipykernel==6.12.1
ipython==7.32.0
ipython-genutils==0.2.0
ipywidgets==7.7.0
isodate==0.6.1
jaraco.classes==3.2.3
jedi==0.18.0
jeepney==0.8.0
Jinja2==3.1.2
jinja2-simple-tags==0.5.0
jmespath==0.10.0
joblib==1.0.1
jsonschema==3.2.0
jupyter-client==6.1.12
jupyter-core==4.8.1
jupyterlab-pygments==0.1.2
jupyterlab-widgets==1.0.0
keyring==23.13.1
kiwisolver==1.3.1
leather==0.3.4
Logbook==1.5.3
lz4==4.3.2
Mako==1.2.4
markdown-it-py==3.0.0
MarkupSafe==2.1.3
mashumaro==3.6
matplotlib==3.4.3
matplotlib-inline==0.1.2
mdurl==0.1.2
minimal-snowplow-tracker==0.0.2
mistune==0.8.4
monotonic==1.6
more-itertools==9.1.0
msgpack==1.0.5
mypy-extensions==0.4.3
nbclient==0.5.3
nbconvert==6.1.0
nbformat==5.1.3
nest-asyncio==1.5.1
networkx==2.8.8
notebook==6.4.5
numpy==1.23.4
oauthlib==3.2.2
openpyxl==3.1.2
packaging==21.0
pandas==1.3.4
pandocfilters==1.4.3
parsedatetime==2.4
parso==0.8.2
pathspec==0.9.0
patsy==0.5.2
pexpect==4.8.0
pickleshare==0.7.5
Pillow==8.4.0
platformdirs==2.5.2
plotly==5.9.0
pluggy==1.0.0
posthog==2.5.0
prometheus-client==0.11.0
prompt-toolkit==3.0.38
protobuf==4.23.2
psutil==5.8.0
psycopg2==2.9.3
ptyprocess==0.7.0
pyarrow==7.0.0
pyasn1==0.5.0
pyasn1-modules==0.3.0
pycparser==2.20
pydantic==1.10.9
pyfiglet==0.8.post1
Pygments==2.15.1
PyGObject==3.36.0
pyodbc==4.0.31
pyparsing==3.0.4
pyrsistent==0.18.0
pytest==7.3.2
pytest-parametrization==2022.2.1
python-apt==2.0.1+ubuntu0.20.4.1
python-dateutil==2.8.2
python-slugify==8.0.1
pytimeparse==1.1.8
pytz==2021.3
PyYAML==6.0
pyzmq==22.2.1
ratelimit==2.2.1
regex==2023.6.3
requests==2.28.2
requests-unixsocket==0.2.0
rich==13.4.2
rsa==4.9
ruamel.yaml==0.17.31
ruamel.yaml.clib==0.2.7
s3transfer==0.5.2
scikit-learn==0.24.2
scipy==1.7.1
seaborn==0.11.2
SecretStorage==3.3.3
Send2Trash==1.8.0
six==1.16.0
slack-sdk==3.21.3
soupsieve==2.4.1
SQLAlchemy==1.4.48
sqlfluff==2.1.1
sqlfluff-templater-dbt==2.1.1
sqlparams==5.1.0
sqlparse==0.4.3
ssh-import-id==5.10
statsmodels==0.12.2
tabulate==0.9.0
tblib==1.7.0
tenacity==8.0.1
terminado==0.9.4
testpath==0.5.0
text-unidecode==1.3
threadpoolctl==2.2.0
thrift==0.16.0
tokenize-rt==4.2.1
toml==0.10.2
tomli==2.0.1
tornado==6.1
tqdm==4.64.0
traitlets==5.1.0
typing_extensions==4.6.3
unattended-upgrades==0.1
urllib3==1.26.7
virtualenv==20.8.0
wcwidth==0.2.5
webencodings==0.5.1
Werkzeug==2.3.6
widgetsnbextension==3.6.0
zipp==3.15.0

I've also played today with Docker images (with different Python versions: 3.9, 3.9.5, 3.10) and can't reproduce the issue. But I can reproduce it on a Databricks cluster. I don't know where the difference comes from.

dbeatty10 commented 1 year ago

I've also tried to reproduce this, but have not been able to.

frankivo commented 1 year ago

On dbt-core==1.6.1 and dbt-core==1.6.2 I get something similar (on databricks): ImportError: cannot import name 'override' from 'typing_extensions' (/databricks/python/lib/python3.10/site-packages/typing_extensions.py)

Fixed by adding typing_extensions==4.7.1 as a dependency

jordandakota commented 11 months ago

I've also hit the same issue with typing_extensions. Adding 4.7.1 seems to make it work again, but that isn't preferable: we should only need to pin dbt-databricks as a dependency, not typing_extensions as well.

leo-schick commented 11 months ago

On dbt-core==1.6.1 and dbt-core==1.6.2 I get something similar (on databricks): ImportError: cannot import name 'override' from 'typing_extensions' (/databricks/python/lib/python3.10/site-packages/typing_extensions.py)

Fixed by adding typing_extensions==4.7.1 as a dependency

I got the same issue when running on Databricks Runtime 13.3 LTS, but on Databricks Runtime 12.2 LTS it worked fine. I did not try pinning typing_extensions==4.7.1 as you did; instead I updated typing_extensions to the newest version, which did not make it work.

This was tried with dbt-core 1.6.2 and dbt-databricks 1.6.4 + 1.6.5.

Here is my full stack trace for further investigation by the devs:

Traceback (most recent call last):
  File "/databricks/python3/bin/dbt", line 5, in <module>
    from dbt.cli.main import cli
  File "/databricks/python/lib/python3.10/site-packages/dbt/cli/__init__.py", line 1, in <module>
    from .main import cli as dbt_cli  # noqa
  File "/databricks/python/lib/python3.10/site-packages/dbt/cli/main.py", line 13, in <module>
    from dbt.cli import requires, params as p
  File "/databricks/python/lib/python3.10/site-packages/dbt/cli/requires.py", line 3, in <module>
    from dbt.adapters.factory import adapter_management, register_adapter
  File "/databricks/python/lib/python3.10/site-packages/dbt/adapters/factory.py", line 8, in <module>
    from dbt.adapters.base.plugin import AdapterPlugin
  File "/databricks/python/lib/python3.10/site-packages/dbt/adapters/base/__init__.py", line 6, in <module>
    from dbt.adapters.base.connections import BaseConnectionManager  # noqa: F401
  File "/databricks/python/lib/python3.10/site-packages/dbt/adapters/base/connections.py", line 35, in <module>
    from dbt.contracts.graph.manifest import Manifest
  File "/databricks/python/lib/python3.10/site-packages/dbt/contracts/graph/manifest.py", line 27, in <module>
    from dbt.contracts.graph.nodes import (
  File "/databricks/python/lib/python3.10/site-packages/dbt/contracts/graph/nodes.py", line 61, in <module>
    from dbt_semantic_interfaces.parsing.where_filter_parser import WhereFilterParser
  File "/databricks/python/lib/python3.10/site-packages/dbt_semantic_interfaces/parsing/where_filter_parser.py", line 14, in <module>
    from dbt_semantic_interfaces.parsing.where_filter.where_filter_dimension import (
  File "/databricks/python/lib/python3.10/site-packages/dbt_semantic_interfaces/parsing/where_filter/where_filter_dimension.py", line 5, in <module>
    from typing_extensions import override
ImportError: cannot import name 'override' from 'typing_extensions' (/databricks/python/lib/python3.10/site-packages/typing_extensions.py)

Hope that helps to fix this issue.

thijs-nijhuis commented 10 months ago

Hi, any updates on this issue? We are running into the same thing. We are implementing Programmatic Invocation here as well and have big plans for it! To isolate this problem I use this small bit of Python code and simply run it from the Databricks Workspace on a cluster that only has dbt-databricks 1.6.6 installed:

print("Starting")

from dbt.cli.main import dbtRunner, dbtRunnerResult

dbt = dbtRunner()
cli_args = ['--version']
res: dbtRunnerResult = dbt.invoke(cli_args)

print("The End")

I have tested it with different LTS Databricks runtime versions (12.2 and 13.3), different access modes (Shared and Single), and with or without installing typing_extensions 4.7.1. All runs used dbt-databricks 1.6.6. These are the results:

| DBX runtime | Access mode | typing_extensions | Result |
| --- | --- | --- | --- |
| 12.2 | Single | Not installed | Perfect, no errors, no warnings |
| 12.2 | Shared | n.a. | It is not possible to pre-install libraries in 12.2 Shared |
| 13.3 | Single | Not installed | Same error as mentioned in the previous post (ImportError: cannot import name 'override' from 'typing_extensions'). Also, the run outputs almost 200 of the same warning lines: ':914: ImportWarning: ImportHookFinder.find_spec() not found; falling back to find_module()' |
| 13.3 | Shared | Not installed | Same result as on Single access mode |
| 13.3 | Single | 4.7.1 installed | The error is gone and the run succeeds. However, the almost 200 warning lines (':914: ImportWarning: ImportHookFinder.find_spec() not found; falling back to find_module()') are still there |
| 13.3 | Shared | 4.7.1 installed | Same result as on Single access mode |

My complete stack trace when I get the error looks like this:

ImportError: cannot import name 'override' from 'typing_extensions' (/databricks/python/lib/python3.10/site-packages/typing_extensions.py)
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
File <command-824890481590461>, line 3
      1 print("Starting")
----> 3 from dbt.cli.main import dbtRunner, dbtRunnerResult
      5 dbt = dbtRunner()
      6 cli_args = ['--version']

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/cli/__init__.py:1
----> 1 from .main import cli as dbt_cli  # noqa

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/cli/main.py:14
      6 import click
      7 from click.exceptions import (
      8     Exit as ClickExit,
      9     BadOptionUsage,
     10     NoSuchOption,
     11     UsageError,
     12 )
---> 14 from dbt.cli import requires, params as p
     15 from dbt.cli.exceptions import (
     16     DbtInternalException,
     17     DbtUsageException,
     18 )
     19 from dbt.contracts.graph.manifest import Manifest

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/cli/requires.py:3
      1 import dbt.tracking
      2 from dbt.version import installed as installed_version
----> 3 from dbt.adapters.factory import adapter_management, register_adapter
      4 from dbt.flags import set_flags, get_flag_dict
      5 from dbt.cli.exceptions import (
      6     ExceptionExit,
      7     ResultExit,
      8 )

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/adapters/factory.py:8
      5 from pathlib import Path
      6 from typing import Any, Dict, List, Optional, Set, Type
----> 8 from dbt.adapters.base.plugin import AdapterPlugin
      9 from dbt.adapters.protocol import AdapterConfig, AdapterProtocol, RelationProtocol
     10 from dbt.contracts.connection import AdapterRequiredConfig, Credentials

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/adapters/base/__init__.py:6
      4 from dbt.contracts.connection import Credentials  # noqa: F401
      5 from dbt.adapters.base.meta import available  # noqa: F401
----> 6 from dbt.adapters.base.connections import BaseConnectionManager  # noqa: F401
      7 from dbt.adapters.base.relation import (  # noqa: F401
      8     BaseRelation,
      9     RelationType,
     10     SchemaSearchMap,
     11 )
     12 from dbt.adapters.base.column import Column  # noqa: F401

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/adapters/base/connections.py:35
     26 import dbt.exceptions
     27 from dbt.contracts.connection import (
     28     Connection,
     29     Identifier,
   (...)
     33     AdapterResponse,
     34 )
---> 35 from dbt.contracts.graph.manifest import Manifest
     36 from dbt.adapters.base.query_headers import (
     37     MacroQueryStringSetter,
     38 )
     39 from dbt.events import AdapterLogger

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/contracts/graph/manifest.py:27
     24 from typing_extensions import Protocol
     25 from uuid import UUID
---> 27 from dbt.contracts.graph.nodes import (
     28     BaseNode,
     29     Documentation,
     30     Exposure,
     31     GenericTestNode,
     32     GraphMemberNode,
     33     Group,
     34     Macro,
     35     ManifestNode,
     36     Metric,
     37     ModelNode,
     38     DeferRelation,
     39     ResultNode,
     40     SemanticModel,
     41     SourceDefinition,
     42     UnpatchedSourceDefinition,
     43 )
     44 from dbt.contracts.graph.unparsed import SourcePatch, NodeVersion, UnparsedVersion
     45 from dbt.contracts.graph.manifest_upgrade import upgrade_manifest_json

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/contracts/graph/nodes.py:61
     59 from dbt_semantic_interfaces.references import MetricReference as DSIMetricReference
     60 from dbt_semantic_interfaces.type_enums import MetricType, TimeGranularity
---> 61 from dbt_semantic_interfaces.parsing.where_filter_parser import WhereFilterParser
     63 from .model_config import (
     64     NodeConfig,
     65     SeedConfig,
   (...)
     72     SemanticModelConfig,
     73 )
     76 # =====================================================================
     77 # This contains the classes for all of the nodes and node-like objects
     78 # in the manifest. In the "nodes" dictionary of the manifest we find
   (...)
     95 # Various parent classes and node attribute classes
     96 # ==================================================

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt_semantic_interfaces/parsing/where_filter_parser.py:14
      7 from dbt_semantic_interfaces.call_parameter_sets import (
      8     FilterCallParameterSets,
      9     ParseWhereFilterException,
     10 )
     11 from dbt_semantic_interfaces.parsing.where_filter.parameter_set_factory import (
     12     ParameterSetFactory,
     13 )
---> 14 from dbt_semantic_interfaces.parsing.where_filter.where_filter_dimension import (
     15     WhereFilterDimensionFactory,
     16 )
     17 from dbt_semantic_interfaces.parsing.where_filter.where_filter_entity import (
     18     WhereFilterEntityFactory,
     19 )
     20 from dbt_semantic_interfaces.parsing.where_filter.where_filter_time_dimension import (
     21     WhereFilterTimeDimensionFactory,
     22 )

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt_semantic_interfaces/parsing/where_filter/where_filter_dimension.py:5
      1 from __future__ import annotations
      3 from typing import List, Optional, Sequence
----> 5 from typing_extensions import override
      7 from dbt_semantic_interfaces.errors import InvalidQuerySyntax
      8 from dbt_semantic_interfaces.protocols.protocol_hint import ProtocolHint

leo-schick commented 10 months ago

@thijs-nijhuis-shell I guess it would be helpful if you could find out for each runtime / mode what version of package typing_extensions is used. e.g. run pip freeze | grep typing_extensions and put the version number in a table as given above. Maybe the python version would help as well (python --version). I guess we should boil that down since here we are in the dbt-core repo, not the Databricks repo.

frankivo commented 10 months ago

I guess we should boil that down since here we are in the dbt-core repo, not the Databricks repo.

But shouldn't dbt-core depend on the right typing_extensions, as it apparently needs it?

thijs-nijhuis commented 10 months ago

pip freeze | grep typing_extensions

I think I found something interesting when trying to fetch the versions.

First off, there is no difference in typing_extensions or Python version between Single and Shared clusters. And, obviously, in my runs where I pre-install 4.7.1, that is the typing_extensions version pip returns as well. So the only two situations to distinguish are runtime 12.2 vs 13.3.

However, what does make a difference is whether you have dbt-core (or dbt-databricks, same result) pre-installed or not. These are the results:

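| DBX runtime | dbt-core installed | Python version | typing_extensions version |
|-------------|--------------------|----------------|---------------------------|
| 12.2        | no                 | 3.9.5          | 4.1.1                     |
| 13.3        | no                 | 3.10.12        | 4.3.0                     |
| 12.2        | yes (1.6.6)        | 3.9.5          | 4.8.0                     |
| 13.3        | yes (1.6.6)        | 3.10.12        | 4.3.0                     |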

So on 12.2, installing dbt-core also installs a new version of typing_extensions. On 13.3 it does not. My guess is that somewhere in the dependencies of dbt-core (or any of its packages) it says typing_extensions >= 4.2 or something like that, triggering an update on 12.2 but not 13.3. Does that make sense?

BTW, when I look at the requirements of dbt-core itself it says >=3.7.4. So maybe one of the other packages it uses caused the upgrade to 4.8.0 on 12.2.
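
The guess above can be made concrete with a small sketch. It assumes a lower bound of 4.2.0 somewhere in the dependency tree, which at this point in the thread is only a hypothesis. The needs_upgrade helper is hypothetical, handles only plain dotted numeric versions, and is not how pip's resolver is actually implemented.

```python
def parse(version: str) -> tuple:
    """Turn a plain dotted version such as '4.1.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def needs_upgrade(installed: str, floor: str) -> bool:
    """True when the installed version is below a '>=' lower bound."""
    return parse(installed) < parse(floor)


# Preinstalled typing_extensions versions reported in the table above.
preinstalled = {"12.2": "4.1.1", "13.3": "4.3.0"}
ASSUMED_FLOOR = "4.2.0"  # assumption taken from the guess in this comment

for runtime, version in preinstalled.items():
    action = "upgrade" if needs_upgrade(version, ASSUMED_FLOOR) else "keep"
    print(f"DBX {runtime}: typing_extensions {version} -> {action}")
```

Under that assumption, only the 12.2 environment falls below the floor, which would match the observation that typing_extensions is upgraded on 12.2 but left at 4.3.0 on 13.3.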

thijs-nijhuis commented 10 months ago

Any idea on how to do that? I tried running the python script that's in my first post on a cluster with 4.8.0 installed instead of 4.7.1. Same result. No errors, a lot of warnings: ':914: ImportWarning: ImportHookFinder.find_spec() not found; falling back to find_module()'

leo-schick commented 10 months ago

Can you try to check it with pipdeptree? https://pypi.org/project/pipdeptree/

As a workaround, it may work to force pip to install version 4.8.0 of typing_extensions after you have installed dbt-core / dbt-databricks.

ghost commented 10 months ago

Hey @thijs-nijhuis-shell, I don't know your setup in Databricks, but if you're using code inside dbx notebooks, please also try this command before running any dbt-api call (restarting python process)


# Restart python process to ensure new installed packages are used
dbutils.library.restartPython()

TBH, dunno why, but it's required if you're installing deps inside the notebook

Fatal1ty commented 10 months ago

The root cause of this issue comes from dbt-semantic-interfaces:

leo-schick commented 10 months ago

@Almaz-Murzabekov I guess this should not be required when you install packages with the %pip magic or the init cluster script.

QMalcolm commented 10 months ago

The root cause of this issue comes from dbt-semantic-interfaces:

@Fatal1ty thank you for opening the PR on dbt-semantic-interfaces!

I want to be clear on what the PR in dbt-semantic-interfaces will be resolving though. It won't resolve the original issue that was seen in dbt-core 1.5 because dbt-semantic-interfaces wasn't in dbt-core 1.5. However it should help resolve the similar issue seen in 1.6 brought up in September.

dbeatty10 commented 10 months ago

As @QMalcolm is calling out, it appears that there are two similar error messages with separate root causes and fixes:

  1. v1.5 ImportError: cannot import name 'Unpack' from 'typing_extensions'
  2. v1.6+ ImportError: cannot import name 'override' from 'typing_extensions'

Based on https://github.com/dbt-labs/dbt-core/issues/7828#issuecomment-1780718825, it sounds like the 2nd one will be fixed by https://github.com/dbt-labs/dbt-semantic-interfaces/issues/193

Based on https://github.com/dbt-labs/dbt-core/issues/7828#issuecomment-1583143556, it originally sounded like the 1st one could be solved by a different version of mashumaro here, but then https://github.com/dbt-labs/dbt-core/issues/7828#issuecomment-1591748406 said otherwise.

So if I'm reading things correctly, we have a fix for dbt-core 1.6+, but do not yet have a path forward for the original issue reported for 1.5.
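
A quick way to tell which of the two cases an environment would hit is to check which names the installed typing_extensions actually exposes. The following is a minimal sketch; missing_names is a hypothetical helper, and it only inspects module attributes without importing dbt.

```python
import importlib


def missing_names(module_name: str, names) -> list:
    """Return the names from `names` that the module does not provide."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # If the module itself is absent, every name is missing.
        return list(names)
    return [name for name in names if not hasattr(module, name)]


# 'Unpack' is the name from the v1.5 error, 'override' the one from v1.6+.
print(missing_names("typing_extensions", ["Unpack", "override"]))
```

An empty list means both imports should succeed; a list containing only override would point at the second case.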

QMalcolm commented 10 months ago

I'm not convinced it will fully fix 1.6+ either, unfortunately 😞 dbt-semantic-interfaces wasn't previously excluding typing-extensions >= 4.4; specifically, its version spec for that dependency was ~4.0, which is essentially >=4.0, <5. If the error is happening on a clean install, then a version of typing-extensions <4.4 being installed means that there is another dependency that is restricting the typing-extensions version.

tlento commented 10 months ago

@QMalcolm and I have done some further digging, and this comment https://github.com/dbt-labs/dbt-core/issues/7828#issuecomment-1780640653 was extremely helpful in our investigation, so big thanks to @thijs-nijhuis-shell

What appears to be happening is Databricks in particular is configuring its initial python environment with typing-extensions set to 4.1/4.3, and a subsequent install of dbt-core 1.6 will not update this because 1.6 is set to require ~=4.0 (by way of dbt-semantic-interfaces).

Since everything works if you install dbt-core (or typing-extensions 4.7.1) first, this suggests databricks is actually perfectly happy with 4.7 or whatever.

However, that databricks 13.3 overwrites the typing-extensions module with 4.3 (while 12.2 leaves that well enough alone) is a bit of a cause for concern. Like, why would it do that?

We're ready to backport the dbt-semantic-interfaces change to address the issue with 1.6, but we'd like to confirm that Databricks 13.3 is happy with a later version of typing_extensions before we merge and deploy. We'll update here once we're done.
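
To illustrate why the preinstalled versions are left alone under ~=4.0, here is a rough sketch of the compatible-release check as I understand it from PEP 440 (~=4.0 meaning >=4.0 and ==4.*). The compatible helper is hypothetical, supports only plain dotted numeric versions, and the ~=4.4 case is included purely as an example of a tighter bound.

```python
def parse(version: str) -> tuple:
    """Turn a plain dotted version such as '4.3.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def compatible(installed: str, spec: str) -> bool:
    """Simplified '~=' check: at least `spec`, same leading components."""
    floor = parse(spec)
    candidate = parse(installed)
    prefix = floor[:-1]  # '~=4.0' pins the leading '4'
    return candidate >= floor and candidate[: len(prefix)] == prefix


for installed in ("4.1.1", "4.3.0", "4.8.0"):
    print(
        installed,
        "~=4.0:", compatible(installed, "4.0"),
        "~=4.4:", compatible(installed, "4.4"),
    )
```

With ~=4.0, both 4.1.1 and 4.3.0 already satisfy the requirement, so an installer has no reason to replace them; with a bound such as ~=4.4 they would not.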

tlento commented 10 months ago

Worth noting - there is nothing I can find in dbt-core's existing dependency tree requiring typing-extensions > 4.1.0 OR restricting it to < 4.4. Mashumaro sets the current floor in production at >=4.1.0, so if you install databricks first there's no real reason for pip to update typing-extensions, and if you install dbt-core first there's no real reason for databricks to clobber typing-extensions on 13.3 but not on 12.2.

leo-schick commented 10 months ago

I think it would make sense then to move this issue over to the Databricks project, right? @benc-db is that something you could help with, to investigate why pip install behavior differs between Databricks Runtime 12.2 and 13.3?

benc-db commented 10 months ago

I can file a ticket; I work on the adapter specifically, so I don't have a lot of insight into the DBR.

tlento commented 10 months ago

The databricks 13.3 runtime appears happy with typing_extensions >= 4.3, so we will merge the relevant fixes into the dbt-semantic-interfaces package today, with plans to deploy the backport tomorrow.

This won't affect existing installations of dbt-core or dbt-databricks, as an update to typing-extensions will not automatically trigger, but new installations should update versions accordingly. Worth noting, the databricks runtime install might still clobber the typing-extensions version on pre-installed dbt-databricks packages, so it could be necessary to install/update dbt-databricks (or even typing-extensions) after the runtime has spun up.

I'll update here when the backport is deployed, as that should address most issues people are encountering with dbt-core 1.6.

thijs-nijhuis commented 10 months ago

The databricks 13.3 runtime appears happy with typing_extensions >= 4.3, so we will merge the relevant fixes into the dbt-semantic-interfaces package today, with plans to deploy the backport tomorrow.

This won't affect existing installations of dbt-core or dbt-databricks, as an update to typing-extensions will not automatically trigger, but new installations should update versions accordingly. Worth noting, the databricks runtime install might still clobber the typing-extensions version on pre-installed dbt-databricks packages, so it could be necessary to install/update dbt-databricks (or even typing-extensions) after the runtime has spun up.

I'll update here when the backport is deployed, as that should address most issues people are encountering with dbt-core 1.6.

Thanks! I will make sure to test it on our end when available.

tlento commented 10 months ago

dbt-semantic-interfaces 0.2.3 has been deployed: https://pypi.org/project/dbt-semantic-interfaces/0.2.3/

At this point, doing a pip upgrade (for existing installations) or doing a fresh install of dbt-core 1.6 or a corresponding adapter package should pin the typing-extensions version to something past 4.4.0.

As mentioned earlier, this will NOT fix issues with 1.5 installs, and even on 1.6 any installation that clobbers the typing_extensions version may still cause problems, but at least if you install dbt-core it should now update typing-extensions appropriately.

Thanks to @Fatal1ty for submitting the quick patch on the DSI side and everyone here (and especially @thijs-nijhuis-shell ) for the detailed info on what was broken.

thijs-nijhuis commented 10 months ago

@tlento, first off, big thanks for the quick fix! I tested the new version. I ran the same Databricks jobs I used before, which all use job clusters, so there is no chance of any pre-installed leftovers.

The good news is that the error is indeed gone! So now, when running on DBX 13.3, I can run my test script without any errors and without installing any typing-extensions version. When I check typing-extensions after installing dbt-core 1.6.6 it says 4.8.0, just like on DBX 12.2. The bad news is that the warnings I mentioned are still there. So when I run my small script mentioned in comment 7828, the output is:

Starting
<frozen importlib._bootstrap>:914: ImportWarning: ImportHookFinder.find_spec() not found; falling back to find_module()
...<<this exact warning repeats almost 200 times>>...
<frozen importlib._bootstrap>:914: ImportWarning: ImportHookFinder.find_spec() not found; falling back to find_module()
Core:
  - installed: 1.6.6
  - latest:    1.6.6 - Up to date!

Plugins:
  - spark:      1.6.0 - Up to date!
  - databricks: 1.6.6 - Up to date!

The End

I don't get these warnings when running on DBX 12.2. I don't know if it's related to package versioning or if it's caused by the fact that 12.2 uses a different Python version (3.9.5) than 13.3 (3.10.12). And I am more than happy to open up a new issue for this if you prefer.

To check the package versioning, I ran the pipdeptree tool that @leo-schick suggested on both 12.2 and 13.3 after installing dbt-databricks 1.6.6. The first list below is for 12.2 and the second for 13.3. I marked each row that uses a different version with a * at the start in both lists. Hope this might help someone spot something odd.

12.2 deptreelist

dbt-core==1.6.6
├── agate [required: ~=1.7.0, installed: 1.7.1]
│   ├── Babel [required: >=2.0, installed: 2.13.1]
│   ├── isodate [required: >=0.5.4, installed: 0.6.1]
│   │   └── six [required: Any, installed: 1.16.0]
│   ├── leather [required: >=0.3.2, installed: 0.3.4]
│   │   └── six [required: >=1.6.1, installed: 1.16.0]
│   ├── parsedatetime [required: >=2.1,!=2.5, installed: 2.6]
│   ├── python-slugify [required: >=1.2.1, installed: 8.0.1]
│   │   └── text-unidecode [required: >=1.3, installed: 1.3]
│   └── pytimeparse [required: >=1.1.5, installed: 1.1.8]
*├── cffi [required: >=1.9,<2.0.0, installed: 1.15.0]
│   └── pycparser [required: Any, installed: 2.21]
├── click [required: <9, installed: 8.0.4]
├── colorama [required: >=0.3.9,<0.5, installed: 0.4.6]
├── dbt-extractor [required: ~=0.4.1, installed: 0.4.1]
├── dbt-semantic-interfaces [required: ~=0.2.0, installed: 0.2.3]
│   ├── click [required: >=7.0,<9.0, installed: 8.0.4]
│   ├── importlib-metadata [required: ~=6.0, installed: 6.8.0]
*│   │   └── zipp [required: >=0.5, installed: 3.17.0]
│   ├── Jinja2 [required: ~=3.0, installed: 3.1.2]
│   │   └── MarkupSafe [required: >=2.0, installed: 2.0.1]
*│   ├── jsonschema [required: ~=4.0, installed: 4.4.0]
│   │   ├── attrs [required: >=17.4.0, installed: 21.4.0]
│   │   └── pyrsistent [required: >=0.14.0,!=0.17.2,!=0.17.1,!=0.17.0, installed: 0.18.0]
*│   ├── more-itertools [required: ~=8.0, installed: 8.14.0]
*│   ├── pydantic [required: ~=1.10, installed: 1.10.13]
│   │   └── typing-extensions [required: >=4.2.0, installed: 4.8.0]
│   ├── python-dateutil [required: ~=2.0, installed: 2.8.2]
│   │   └── six [required: >=1.5, installed: 1.16.0]
│   ├── PyYAML [required: ~=6.0, installed: 6.0.1]
│   └── typing-extensions [required: ~=4.4, installed: 4.8.0]
├── hologram [required: ~=0.0.16, installed: 0.0.16]
*│   ├── jsonschema [required: >=3.0, installed: 4.4.0]
│   │   ├── attrs [required: >=17.4.0, installed: 21.4.0]
│   │   └── pyrsistent [required: >=0.14.0,!=0.17.2,!=0.17.1,!=0.17.0, installed: 0.18.0]
│   └── python-dateutil [required: >=2.8,<2.9, installed: 2.8.2]
│       └── six [required: >=1.5, installed: 1.16.0]
├── idna [required: >=2.5,<4, installed: 3.3]
├── isodate [required: >=0.6,<0.7, installed: 0.6.1]
│   └── six [required: Any, installed: 1.16.0]
├── Jinja2 [required: ~=3.1.2, installed: 3.1.2]
│   └── MarkupSafe [required: >=2.0, installed: 2.0.1]
├── Logbook [required: >=1.5,<1.6, installed: 1.5.3]
├── mashumaro [required: ~=3.8.1, installed: 3.8.1]
│   └── typing-extensions [required: >=4.1.0, installed: 4.8.0]
├── minimal-snowplow-tracker [required: ~=0.0.2, installed: 0.0.2]
*│   ├── requests [required: >=2.2.1,<3.0, installed: 2.27.1]
*│   │   ├── certifi [required: >=2017.4.17, installed: 2021.10.8]
│   │   ├── charset-normalizer [required: ~=2.0.0, installed: 2.0.4]
│   │   ├── idna [required: >=2.5,<4, installed: 3.3]
*│   │   └── urllib3 [required: >=1.21.1,<1.27, installed: 1.26.9]
│   └── six [required: >=1.9.0,<2.0, installed: 1.16.0]
├── networkx [required: >=2.3,<4, installed: 3.2.1]
├── packaging [required: >20.9, installed: 21.3]
*│   └── pyparsing [required: >=2.0.2,!=3.0.5, installed: 3.0.4]
├── pathspec [required: >=0.9,<0.12, installed: 0.9.0]
├── protobuf [required: >=4.0.0, installed: 4.24.4]
*├── pytz [required: >=2015.7, installed: 2021.3]
├── PyYAML [required: >=6.0, installed: 6.0.1]
*├── requests [required: <3.0.0, installed: 2.27.1]
*│   ├── certifi [required: >=2017.4.17, installed: 2021.10.8]
│   ├── charset-normalizer [required: ~=2.0.0, installed: 2.0.4]
│   ├── idna [required: >=2.5,<4, installed: 3.3]
*│   └── urllib3 [required: >=1.21.1,<1.27, installed: 1.26.9]
├── sqlparse [required: >=0.2.3,<0.5, installed: 0.4.4]
├── typing-extensions [required: >=3.7.4, installed: 4.8.0]
*└── urllib3 [required: ~=1.0, installed: 1.26.9]

13.3 deptreelist

dbt-core==1.6.6
├── agate [required: ~=1.7.0, installed: 1.7.1]
│   ├── Babel [required: >=2.0, installed: 2.13.1]
│   ├── isodate [required: >=0.5.4, installed: 0.6.1]
│   │   └── six [required: Any, installed: 1.16.0]
│   ├── leather [required: >=0.3.2, installed: 0.3.4]
│   │   └── six [required: >=1.6.1, installed: 1.16.0]
│   ├── parsedatetime [required: >=2.1,!=2.5, installed: 2.6]
│   ├── python-slugify [required: >=1.2.1, installed: 8.0.1]
│   │   └── text-unidecode [required: >=1.3, installed: 1.3]
│   └── pytimeparse [required: >=1.1.5, installed: 1.1.8]
*├── cffi [required: >=1.9,<2.0.0, installed: 1.15.1]
│   └── pycparser [required: Any, installed: 2.21]
├── click [required: <9, installed: 8.0.4]
├── colorama [required: >=0.3.9,<0.5, installed: 0.4.6]
├── dbt-extractor [required: ~=0.4.1, installed: 0.4.1]
├── dbt-semantic-interfaces [required: ~=0.2.0, installed: 0.2.3]
│   ├── click [required: >=7.0,<9.0, installed: 8.0.4]
│   ├── importlib-metadata [required: ~=6.0, installed: 6.8.0]
*│   │   └── zipp [required: >=0.5, installed: 1.0.0]
│   ├── Jinja2 [required: ~=3.0, installed: 3.1.2]
│   │   └── MarkupSafe [required: >=2.0, installed: 2.0.1]
*│   ├── jsonschema [required: ~=4.0, installed: 4.16.0]
│   │   ├── attrs [required: >=17.4.0, installed: 21.4.0]
│   │   └── pyrsistent [required: >=0.14.0,!=0.17.2,!=0.17.1,!=0.17.0, installed: 0.18.0]
*│   ├── more-itertools [required: ~=8.0, installed: 8.10.0]
*│   ├── pydantic [required: ~=1.10, installed: 1.10.6]
│   │   └── typing-extensions [required: >=4.2.0, installed: 4.8.0]
│   ├── python-dateutil [required: ~=2.0, installed: 2.8.2]
│   │   └── six [required: >=1.5, installed: 1.16.0]
│   ├── PyYAML [required: ~=6.0, installed: 6.0.1]
│   └── typing-extensions [required: ~=4.4, installed: 4.8.0]
├── hologram [required: ~=0.0.16, installed: 0.0.16]
*│   ├── jsonschema [required: >=3.0, installed: 4.16.0]
│   │   ├── attrs [required: >=17.4.0, installed: 21.4.0]
│   │   └── pyrsistent [required: >=0.14.0,!=0.17.2,!=0.17.1,!=0.17.0, installed: 0.18.0]
│   └── python-dateutil [required: >=2.8,<2.9, installed: 2.8.2]
│       └── six [required: >=1.5, installed: 1.16.0]
├── idna [required: >=2.5,<4, installed: 3.3]
├── isodate [required: >=0.6,<0.7, installed: 0.6.1]
│   └── six [required: Any, installed: 1.16.0]
├── Jinja2 [required: ~=3.1.2, installed: 3.1.2]
│   └── MarkupSafe [required: >=2.0, installed: 2.0.1]
├── Logbook [required: >=1.5,<1.6, installed: 1.5.3]
├── mashumaro [required: ~=3.8.1, installed: 3.8.1]
│   └── typing-extensions [required: >=4.1.0, installed: 4.8.0]
├── minimal-snowplow-tracker [required: ~=0.0.2, installed: 0.0.2]
*│   ├── requests [required: >=2.2.1,<3.0, installed: 2.28.1]
*│   │   ├── certifi [required: >=2017.4.17, installed: 2022.9.14]
│   │   ├── charset-normalizer [required: >=2,<3, installed: 2.0.4]
│   │   ├── idna [required: >=2.5,<4, installed: 3.3]
*│   │   └── urllib3 [required: >=1.21.1,<1.27, installed: 1.26.11]
│   └── six [required: >=1.9.0,<2.0, installed: 1.16.0]
├── networkx [required: >=2.3,<4, installed: 3.2.1]
├── packaging [required: >20.9, installed: 21.3]
*│   └── pyparsing [required: >=2.0.2,!=3.0.5, installed: 3.0.9]
├── pathspec [required: >=0.9,<0.12, installed: 0.9.0]
├── protobuf [required: >=4.0.0, installed: 4.24.4]
*├── pytz [required: >=2015.7, installed: 2022.1]
├── PyYAML [required: >=6.0, installed: 6.0.1]
*├── requests [required: <3.0.0, installed: 2.28.1]
*│   ├── certifi [required: >=2017.4.17, installed: 2022.9.14]
│   ├── charset-normalizer [required: >=2,<3, installed: 2.0.4]
│   ├── idna [required: >=2.5,<4, installed: 3.3]
*│   └── urllib3 [required: >=1.21.1,<1.27, installed: 1.26.11]
├── sqlparse [required: >=0.2.3,<0.5, installed: 0.4.4]
├── typing-extensions [required: >=3.7.4, installed: 4.8.0]
*└── urllib3 [required: ~=1.0, installed: 1.26.11]
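
Marking the differing rows by hand is error-prone, so the comparison could also be scripted. The sketch below is only an illustration: it assumes the "name [required: ..., installed: ...]" line format shown in the two lists above, the helper names are invented, and the sample input is a three-line excerpt from those lists rather than the full trees.

```python
import re

# Matches e.g. "├── cffi [required: >=1.9,<2.0.0, installed: 1.15.0]"
ENTRY = re.compile(r"([A-Za-z0-9_.\-]+) \[required: .*?, installed: ([^\]]+)\]")


def installed_versions(tree_text: str) -> dict:
    """Map each package name in a pipdeptree listing to its installed version."""
    versions = {}
    for line in tree_text.splitlines():
        match = ENTRY.search(line)
        if match:
            versions[match.group(1).lower()] = match.group(2)
    return versions


def version_differences(left: dict, right: dict) -> dict:
    """Return {package: (left_version, right_version)} where the two differ."""
    shared = sorted(left.keys() & right.keys())
    return {name: (left[name], right[name]) for name in shared if left[name] != right[name]}


RUNTIME_12_2 = """\
├── cffi [required: >=1.9,<2.0.0, installed: 1.15.0]
├── click [required: <9, installed: 8.0.4]
│   │   └── zipp [required: >=0.5, installed: 3.17.0]
"""

RUNTIME_13_3 = """\
├── cffi [required: >=1.9,<2.0.0, installed: 1.15.1]
├── click [required: <9, installed: 8.0.4]
│   │   └── zipp [required: >=0.5, installed: 1.0.0]
"""

differences = version_differences(
    installed_versions(RUNTIME_12_2), installed_versions(RUNTIME_13_3)
)
for package, (old, new) in differences.items():
    print(f"{package}: 12.2 has {old}, 13.3 has {new}")
```

Feeding the complete pipdeptree output of each runtime into installed_versions should reproduce the rows marked with a * above.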

Hangzhi commented 10 months ago

@thijs-nijhuis-shell

The warning <frozen importlib._bootstrap>:914: ImportWarning: ImportHookFinder.find_spec() not found; falling back to find_module() appears because the default Python version on Databricks Runtime 13.3 is 3.10, and Python 3.10 is pushing the deprecation of find_module(). Some dependent packages may still use the legacy find_module() method.

What’s New In Python 3.10 — Python 3.10.13 documentation

Specifically, find_loader()/find_module() (superseded by find_spec()), load_module() (superseded by exec_module()), module_repr() (which the import system takes care of for you), the __package__ attribute (superseded by __spec__.parent), the __loader__ attribute (superseded by __spec__.loader), and the __cached__ attribute (superseded by __spec__.cached) will slowly be removed (as well as other classes and methods in importlib). ImportWarning and/or DeprecationWarning will be raised as appropriate to help identify code which needs updating during this transition.
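
For context, this is roughly what a finder looks like when it implements the newer find_spec() hook instead of find_module(). It is a minimal sketch and not the ImportHookFinder from the warning: RecordingFinder and probe are invented names, and the finder only records lookups and then defers to the regular import machinery.

```python
import importlib
import sys
from importlib.abc import MetaPathFinder


class RecordingFinder(MetaPathFinder):
    """A meta path finder that implements find_spec() and defers every import."""

    def __init__(self):
        self.seen = []

    def find_spec(self, fullname, path, target=None):
        self.seen.append(fullname)
        return None  # None means "not handled here, ask the next finder"


def probe(module_name: str) -> list:
    """Import a module with the finder installed and return the names it saw."""
    finder = RecordingFinder()
    sys.meta_path.insert(0, finder)
    try:
        try:
            importlib.import_module(module_name)
        except ImportError:
            pass  # a missing module is fine for this demonstration
    finally:
        sys.meta_path.remove(finder)
    return finder.seen


# A name that does not exist still goes through every finder on sys.meta_path.
print(probe("no_such_module_for_demo"))
```

Because this finder provides find_spec(), the import system has no reason to fall back to find_module() for it, which is the fallback the ImportWarning above is complaining about.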

tlento commented 10 months ago

@thijs-nijhuis-shell if it's always ImportHookFinder that may be due to the newrelic python agent. Upgrading their package may resolve it. https://github.com/newrelic/newrelic-python-agent/issues/436

It doesn't appear that dbt relies on this anywhere, as a clean install of dbt-databricks does not bundle that package, nor can I find any evidence of it (or the ImportHookFinder) being referenced in vendored code within my venv (although I admit I'm not very good at that sort of investigation).

thijs-nijhuis commented 10 months ago

Hi @Hangzhi and @tlento, thanks for the feedback.

First off, I ran my sample script on the newly released dbt-core 1.7.0 today, but I got the old 'typing_extensions' error on DBX 13.3. So I think the fix didn't make it into that release. I don't have any issues when using dbt-core 1.6.6.

Second, the warnings. As you are probably aware, I am not a seasoned python programmer by any standard so I am struggling a bit on debugging this one. So I'll just write down what I tried (some of which might not make any sense at all) and hope this helps a more knowledgeable person think of something.

  1. I have expanded my debug code with some more imports (datetime and typing_extensions) and placed debug lines in between. The other imports did not yield any warnings, and it is clear that the warnings appear when executing this line: from dbt.cli.main import dbtRunner. So I assume that the warnings come from something inside that package or any 3rd party packages it might need. Right?
  2. I looked for 'ImportHookFinder' through all of the code in my venv. Only found it in the 'wrapt' package. Not sure why I have it in my venv. It is not on the dbx cluster when I run it. As a doublecheck, I did install the latest on the job cluster before running but that had no effect.
  3. I installed the latest version of newrelic (9.1.1) before running but no effect
  4. Googling a bit got me to this issue: https://github.com/ansible/ansible-lint/issues/1919. I know the message is different, but I tried to install the latest ansible and ansible-core anyway. No effect.
  5. When I run my script on a cluster with dbt-core 1.6.6 on it (so not within a job) I get the message as well. If I run it a second time the warnings are gone. If I 'detach & re-attach' the script and then run it, the warnings are back. But I guess that only means the modules are loaded once, right?
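
Steps 1 and 5 above can be made a bit more systematic by recording the warnings emitted during a single import instead of reading them off the console. This is a minimal sketch using the standard warnings module; import_and_collect_warnings is a hypothetical helper, and json is used only as a stand-in for dbt.cli.main so the example runs anywhere.

```python
import importlib
import warnings


def import_and_collect_warnings(module_name: str):
    """Import a module and return it together with the warnings it raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record everything, even repeats
        module = importlib.import_module(module_name)
    return module, caught


# Replace "json" with "dbt.cli.main" in an environment where dbt is installed.
module, caught = import_and_collect_warnings("json")
print(f"{module.__name__}: {len(caught)} warning(s)")
for entry in caught:
    print(entry.category.__name__, entry.filename, entry.lineno, entry.message)
```

A module that is already in sys.modules is not executed again, so a second call in the same Python process would be expected to record nothing. That would be consistent with the warnings disappearing on the second run in step 5.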

I think I am going to implement this workaround for now but hope we can find a proper fix for this at some point:

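import warnings

# Promote ImportWarning to an exception for the duration of the import
with warnings.catch_warnings():
    warnings.simplefilter("error", ImportWarning)
    try:
        from dbt.cli.main import dbtRunner, dbtRunnerResult
    except ImportWarning:
        # The first ImportWarning aborts the import at this point
        pass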

Edit: This workaround doesn't work as this will stop importing the dbt package. See comments below.
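
The reason the "error" filter stops the import can be shown with a small stand-alone sketch. Here noisy_setup is a made-up stand-in for a module whose import emits an ImportWarning; nothing below touches dbt or Databricks.

```python
import warnings


def noisy_setup() -> str:
    """Stand-in for an import that emits an ImportWarning part-way through."""
    warnings.warn("legacy hook used", ImportWarning)
    return "loaded"  # only reached if the warning did not become an exception


def run_with_filter(action: str) -> str:
    """Run noisy_setup under the given warnings filter action."""
    with warnings.catch_warnings():
        warnings.simplefilter(action, ImportWarning)
        try:
            return noisy_setup()
        except ImportWarning:
            return "aborted"


print(run_with_filter("error"))
print(run_with_filter("ignore"))
```

With "error", the warning is raised as an exception at the point where it is issued, so everything after it never runs. With "ignore", the function runs to completion. Whether "ignore" also silences the warnings on a given Databricks runtime is a separate question that this sketch does not answer.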

tlento commented 10 months ago

First off, I ran my sample script on the newly release dbt-core 1.7.0 today but I got the old 'typing_extension' error on DBX 13.3. So I think fix didn't make it into that release. I don't have any issues when using dbt-core 1.6.6.

🤦 sorry, I forgot to backport and deploy the change to dbt-semantic-interfaces to the 0.4 branch so 1.7.0 is still using the old dependency. The fix is up in https://github.com/dbt-labs/dbt-semantic-interfaces/pull/205 and should be deployed early next week.

Regarding the warnings, I can't repro them at all on a local install of dbt-databricks~=1.6 into a clean virtual environment, so I suspect they have something to do with either the databricks 13.3 runtime configuration itself, or the specific python environment setup you're using there.

Note if it is due to wrapt, that is packaged with the databricks 13.3 runtime: https://docs.databricks.com/en/release-notes/runtime/13.3lts-ml.html#python-libraries-on-cpu-clusters

tlento commented 10 months ago

The fix for dbt-semantic-interfaces has been backported to the 0.4.latest branch and deployed as v0.4.1. This should address the import issues described here for dbt-core 1.7.x, although a clean reinstall or manual upgrade of the dbt-semantic-interfaces package might be necessary.

The problem originally raised in this issue does not appear to be due to a dependency mismatch, as the Unpack type was added to typing-extensions with version 4.1, and mashumaro 3.6 (which is where dbt-core 1.5 is currently pinned) requires typing-extensions >= 4.1

I believe the problem was once again with the way databricks runtime installs packages, which may lead it to clobber existing installs back to version pins in the requirements defined for the base dbx runtime environment. Ensuring that dbt is installed after the databricks runtime finishes initializing should now solve all of these issues.

I've added a PR to update the dbt-core dependency to reflect the reality that dbt-core now requires typing-extensions 4.4 or later. Once that merges I think it's safe to consider this issue closed, unless someone here runs into a new version of this same problem.

thijs-nijhuis commented 10 months ago

@tlento , thanks for fixing it for 1.7.x as well and all the effort you put into this.

Unfortunately, my workaround to suppress the warnings doesn't work after all. When promoting the warnings to errors, the dbt library is not loaded (completely), which makes sense of course. For some reason, when I set the warnings to 'ignore' instead of 'error', as the Python documentation describes, the warnings are still shown. I have no idea how to solve or work around this.

The link you added about 13.3 runtime points to 13.3LTS ML. We use 13.3LTS and that one doesn't have the wrapt package installed it seems. I also don't see it when doing the deptreelist.

I also don't get the warning (or the error that is solved now) when running locally. I am going to create an issue in the dbt-databricks GitHub repo for this.

tlento commented 10 months ago

@thijs-nijhuis-shell sounds good, thanks. If there's some weird interaction on those warnings that turns out to be something we need to fix on the dbt side please do open a separate issue. Hopefully you can get an update that makes those warnings stop.