dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

Dagster Serverless Deploy not working when importing dbt project #21862

Closed by juraurai 5 months ago

juraurai commented 5 months ago

Dagster version

1.7.5

What's the issue?

When trying to import a dbt project from GitHub into Dagster Cloud, I get the following errors:

ERROR: launchpadlib 1.10.13 requires testresources, which is not installed.
ERROR: dbt-semantic-interfaces 0.4.4 has requirement importlib-metadata~=6.0, but you'll have importlib-metadata 7.1.0 which is incompatible.
ERROR: dbt-core 1.7.14 has requirement agate~=1.7.0, but you'll have agate 1.10.2 which is incompatible.
ERROR: grpcio-health-checking 1.63.0 has requirement protobuf<6.0dev,>=5.26.1, but you'll have protobuf 4.25.3 which is incompatible.
ERROR: dbt-common 1.0.4 has requirement agate<1.10,>=1.7.0, but you'll have agate 1.10.2 which is incompatible.
ERROR: dbt-snowflake 1.8.0 has requirement dbt-core>=1.8.0a1, but you'll have dbt-core 1.7.14 which is incompatible.
Installing collected packages: typing-extensions, msgpack, mashumaro, six, isodate, PyYAML, charset-normalizer, certifi, urllib3, idna, requests, minimal-snowplow-tracker, protobuf, text-unidecode, python-slugify, leather, pytz, Babel, parsedatetime, pytimeparse, agate, MarkupSafe, Jinja2, click, more-itertools, python-dateutil, annotated-types, pydantic-core, pydantic, pkgutil-resolve-name, rpds-py, attrs, referencing, zipp, importlib-resources, jsonschema-specifications, jsonschema, importlib-metadata, dbt-semantic-interfaces, pathspec, dbt-extractor, colorama, pycparser, cffi, networkx, logbook, sqlparse, dbt-core, orjson, fsspec, universal-pathlib, grpcio, grpcio-health-checking, humanfriendly, coloredlogs, watchdog, dagster-pipes, python-dotenv, setuptools, toposort, greenlet, sqlalchemy, croniter, structlog, tqdm, Mako, alembic, docstring-parser, tabulate, pytzdata, pendulum, mdurl, markdown-it-py, pygments, rich, filelock, dagster, shellingham, typer, sqlglotrs, sqlglot, dagster-dbt, pyjwt, asn1crypto, cryptography, sortedcontainers, tomlkit, platformdirs, pyOpenSSL, jeepney, SecretStorage, jaraco.classes, keyring, snowflake-connector-python, dbt-common, dbt-adapters, dbt-snowflake, pex, uritemplate, github3.py, wcwidth, prompt-toolkit, questionary, dagster-cloud-cli, dagster-cloud, dbt-sf-dagster
Successfully installed Babel-2.15.0 Jinja2-3.1.4 Mako-1.3.5 MarkupSafe-2.1.5 PyYAML-6.0.1 SecretStorage-3.3.3 agate-1.10.2 alembic-1.13.1 annotated-types-0.6.0 asn1crypto-1.5.1 attrs-23.2.0 certifi-2024.2.2 cffi-1.16.0 charset-normalizer-3.3.2 click-8.1.7 colorama-0.4.6 coloredlogs-14.0 croniter-2.0.5 cryptography-42.0.7 dagster-1.7.5 dagster-cloud-1.7.5 dagster-cloud-cli-1.7.5 dagster-dbt-0.23.5 dagster-pipes-1.7.5 dbt-adapters-1.1.1 dbt-common-1.0.4 dbt-core-1.7.14 dbt-extractor-0.5.1 dbt-semantic-interfaces-0.4.4 dbt-sf-dagster-0.0.1 dbt-snowflake-1.8.0 docstring-parser-0.16 filelock-3.14.0 fsspec-2024.3.1 github3.py-4.0.1 greenlet-3.0.3 grpcio-1.63.0 grpcio-health-checking-1.63.0 humanfriendly-10.0 idna-3.7 importlib-metadata-7.1.0 importlib-resources-6.4.0 isodate-0.6.1 jaraco.classes-3.4.0 jeepney-0.8.0 jsonschema-4.22.0 jsonschema-specifications-2023.12.1 keyring-24.3.1 leather-0.4.0 logbook-1.5.3 markdown-it-py-3.0.0 mashumaro-3.13 mdurl-0.1.2 minimal-snowplow-tracker-0.0.2 more-itertools-10.2.0 msgpack-1.0.8 networkx-3.1 orjson-3.10.3 parsedatetime-2.6 pathspec-0.11.2 pendulum-2.1.2 pex-2.3.1 pkgutil-resolve-name-1.3.10 platformdirs-4.2.2 prompt-toolkit-3.0.36 protobuf-4.25.3 pyOpenSSL-24.1.0 pycparser-2.22 pydantic-2.7.1 pydantic-core-2.18.2 pygments-2.18.0 pyjwt-2.8.0 python-dateutil-2.9.0.post0 python-dotenv-1.0.1 python-slugify-8.0.4 pytimeparse-1.1.8 pytz-2024.1 pytzdata-2020.1 questionary-2.0.1 referencing-0.35.1 requests-2.31.0 rich-13.7.1 rpds-py-0.18.1 setuptools-69.5.1 shellingham-1.5.4 six-1.16.0 snowflake-connector-python-3.10.0 sortedcontainers-2.4.0 sqlalchemy-2.0.30 sqlglot-23.15.8 sqlglotrs-0.2.5 sqlparse-0.5.0 structlog-24.1.0 tabulate-0.9.0 text-unidecode-1.3 tomlkit-0.12.5 toposort-1.10 tqdm-4.66.4 typer-0.12.3 typing-extensions-4.11.0 universal-pathlib-0.2.2 uritemplate-4.1.1 urllib3-1.26.18 watchdog-4.0.0 wcwidth-0.2.13 zipp-3.18.1
Traceback (most recent call last):
  File "/home/runner/.local/bin/dagster-dbt", line 5, in <module>
    from dagster_dbt.cli.app import app
  File "/home/runner/.local/lib/python3.8/site-packages/dagster_dbt/__init__.py", line 20, in <module>
    from .core import (
  File "/home/runner/.local/lib/python3.8/site-packages/dagster_dbt/core/__init__.py", line 5, in <module>
    from .resources_v2 import (
  File "/home/runner/.local/lib/python3.8/site-packages/dagster_dbt/core/resources_v2.py", line 53, in <module>
    from dbt.config import RuntimeConfig
  File "/home/runner/.local/lib/python3.8/site-packages/dbt/config/__init__.py", line 2, in <module>
    from .profile import Profile  # noqa
  File "/home/runner/.local/lib/python3.8/site-packages/dbt/config/profile.py", line 8, in <module>
    from dbt.clients.system import load_file_contents
  File "/home/runner/.local/lib/python3.8/site-packages/dbt/clients/system.py", line 18, in <module>
    from dbt.events.functions import fire_event
  File "/home/runner/.local/lib/python3.8/site-packages/dbt/events/__init__.py", line 1, in <module>
    from .adapter_endpoint import AdapterLogger  # noqa: F401
  File "/home/runner/.local/lib/python3.8/site-packages/dbt/events/adapter_endpoint.py", line 3, in <module>
    from dbt.events.functions import fire_event, EVENT_MANAGER
  File "/home/runner/.local/lib/python3.8/site-packages/dbt/events/functions.py", line 2, in <module>
    from dbt.events.base_types import BaseEvent, EventLevel, EventMsg
  File "/home/runner/.local/lib/python3.8/site-packages/dbt/events/base_types.py", line 4, in <module>
    from dbt.events import types_pb2

TypeError: Couldn't build proto file into descriptor pool: duplicate file name types.proto
Error: Process completed with exit code 1.

What did you expect to happen?

I expected the dbt project to import and the Serverless Deploy to run without errors. I did this a couple of months ago without hitting these errors.

How to reproduce?

  1. Create a dbt project on GitHub with a corresponding profiles.yml. We were using Snowflake as our data warehouse, but I don't think that is related, because the error also happens with a new dbt project created from the jaffleshop example.

  2. Import the dbt project using the dbt import function in Dagster Cloud with the serverless agent.

  3. Try to deploy the Serverless Agent. In the GitHub Actions deployment, the error occurs in the "prepare DBT project for deployment" step.

Deployment type

None

Deployment details

No custom configuration, just a dbt project on GitHub and Dagster Cloud.

Additional information

I'm using the 1-month trial

maximearmstrong commented 5 months ago

Hey @juraurai - Thanks for reaching out. I was able to reproduce using the jaffleshop example, which resulted in the following error trace.

ERROR: launchpadlib 1.10.13 requires testresources, which is not installed.
ERROR: grpcio-health-checking 1.63.0 has requirement protobuf<6.0dev,>=5.26.1, but you'll have protobuf 4.25.3 which is incompatible.
ERROR: dbt-semantic-interfaces 0.4.4 has requirement importlib-metadata~=6.0, but you'll have importlib-metadata 7.1.0 which is incompatible.
ERROR: dbt-core 1.7.14 has requirement urllib3~=1.0, but you'll have urllib3 2.2.1 which is incompatible.
ERROR: dbt-duckdb 1.8.0 has requirement dbt-core>=1.8.0, but you'll have dbt-core 1.7.14 which is incompatible.

I managed to fix this by editing the setup.py file generated by the "Import a dbt project" function in Dagster+. The setup.py file is added to the PR that the import process creates in the GitHub repo.

For the jaffleshop example, forcing dbt-duckdb<1.8 in setup.py fixed the problem. In your case, doing the same for Snowflake should temporarily fix your problem.

I think the problem comes from the dbt-core version constraints that we use for our dagster-dbt library. This will require a fix on our end.
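
For anyone triaging a similar trace, the "has requirement" lines that pip prints can be pulled out mechanically, which makes the mismatched pair (here, dbt-duckdb 1.8.0 against dbt-core 1.7.14) easier to spot. The following is a minimal sketch using only the standard library. The names Conflict and parse_pip_conflicts and the regular expression are illustrative assumptions, not part of pip, dbt, or Dagster, and the pattern only covers the message wording seen in this thread.

```python
import re
from typing import List, NamedTuple


class Conflict(NamedTuple):
    package: str      # package that declares the requirement
    version: str      # installed version of that package
    requirement: str  # the requirement it declares
    installed: str    # what pip reports as actually installed


# Matches lines of the form seen in the trace above (assumed wording).
_PATTERN = re.compile(
    r"^ERROR: (?P<package>\S+) (?P<version>\S+) has requirement (?P<requirement>.+?), "
    r"but you'll have (?P<installed>.+?) which is incompatible\.$"
)


def parse_pip_conflicts(log: str) -> List[Conflict]:
    """Return one Conflict per 'has requirement' line; other lines are skipped."""
    conflicts = []
    for line in log.splitlines():
        match = _PATTERN.match(line.strip())
        if match:
            conflicts.append(Conflict(**match.groupdict()))
    return conflicts


# Sample input copied from the trace in this thread.
LOG = """\
ERROR: launchpadlib 1.10.13 requires testresources, which is not installed.
ERROR: dbt-core 1.7.14 has requirement urllib3~=1.0, but you'll have urllib3 2.2.1 which is incompatible.
ERROR: dbt-duckdb 1.8.0 has requirement dbt-core>=1.8.0, but you'll have dbt-core 1.7.14 which is incompatible.
"""

for c in parse_pip_conflicts(LOG):
    print(f"{c.package} {c.version} wants {c.requirement}; found {c.installed}")
```

The "requires ..., which is not installed" line is deliberately ignored, since it reports a missing package rather than a version conflict.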

juraurai commented 5 months ago

Thanks for the quick reply! This fixed it indeed.

FYI, I replaced dbt-snowflake with dbt-snowflake<1.8.0
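
For reference, the replacement described above might look roughly like this in the generated setup.py. This is a sketch only: the actual file produced by the import flow is not shown in this thread, so the other entries in the list (guessed from the packages in the install log) and the variable name INSTALL_REQUIRES are assumptions. Only the dbt-snowflake<1.8.0 pin comes from the comments above.

```python
# Hypothetical excerpt of the generated setup.py; only the dbt-snowflake
# line reflects the change described in this thread.
INSTALL_REQUIRES = [
    "dagster",
    "dagster-cloud",
    "dagster-dbt",
    # Was "dbt-snowflake". Capping below 1.8.0 avoids pulling in an adapter
    # that requires dbt-core>=1.8.0a1 while dbt-core 1.7.14 is installed.
    "dbt-snowflake<1.8.0",
]

# In setup.py this list would be passed as setup(install_requires=INSTALL_REQUIRES).
```

The cap is a temporary workaround, as noted earlier in the thread, and can be dropped once the upstream fix is available.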

maximearmstrong commented 5 months ago

@juraurai Glad to hear it worked! FYI, PR #21904 fixes the problem and should be used in production soon.