dbt-labs / dbt-spark

dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
https://getdbt.com
Apache License 2.0

[ADAP-804] [Bug] published dbt-spark docker images are missing latest versions (1.5.x, 1.6.0) #873

Closed gschmutz closed 6 months ago

gschmutz commented 1 year ago

Is this a new bug in dbt-spark?

Current Behavior

For all other dbt adapters, the latest versions are published as Docker images under the repository's packages. For dbt-spark, Docker images are published, but the latest versions are missing.

Expected Behavior

Docker images for dbt-spark versions 1.5.0 and 1.6.0 should be available.

Steps To Reproduce

n.a.

Relevant log output

No response

Environment

- OS:
- Python:
- dbt-core:
- dbt-spark:

Additional Context

No response

gschmutz commented 1 year ago

I just realized that the Docker build from the dbt-core project (https://github.com/dbt-labs/dbt-core/tree/main/docker) does not work for dbt-spark when using the PyHive version (or the default, all):

docker build --tag my-dbt-spark:1.6.0 --target dbt-spark --build-arg dbt_core_ref=dbt-core@v1.6.0 --build-arg dbt_spark_ref=dbt-spark@v1.6.0 --build-arg dbt_spark_version=PyHive .

produces an error:

...
107.2   note: This error originates from a subprocess, and is likely not a problem with pip.
107.2   ERROR: Failed building wheel for sasl
107.2   Running setup.py clean for sasl
107.4   Building wheel for future (setup.py): started
108.0   Building wheel for future (setup.py): finished with status 'done'
108.0   Created wheel for future: filename=future-0.18.3-py3-none-any.whl size=492023 sha256=0dced4fde8484b7cf07f3ca722cbe787880c6fcb8eb27af37c82213dd20b48b8
108.0   Stored in directory: /tmp/pip-ephem-wheel-cache-pgg_8qmj/wheels/da/19/ca/9d8c44cd311a955509d7e13da3f0bea42400c469ef825b580b
108.0   Building wheel for PyHive (setup.py): started
108.3   Building wheel for PyHive (setup.py): finished with status 'done'
108.3   Created wheel for PyHive: filename=PyHive-0.6.5-py3-none-any.whl size=51554 sha256=b78987c7c11b9d3a18704d5339f9d1caf6221976e1f4c572f609fac9dd9da102
108.3   Stored in directory: /tmp/pip-ephem-wheel-cache-pgg_8qmj/wheels/cc/b2/8d/74115da1b8e1ee44544ec7870783c9fbf1127b66d296f6c4be
108.3   Building wheel for pure-sasl (setup.py): started
108.6   Building wheel for pure-sasl (setup.py): finished with status 'done'
108.6   Created wheel for pure-sasl: filename=pure_sasl-0.6.2-py3-none-any.whl size=11423 sha256=ef452afe0aeb515f2ad15f63e0df15ea5c620fef4e4f7d4413de8ebdb05b064e
108.6   Stored in directory: /tmp/pip-ephem-wheel-cache-pgg_8qmj/wheels/be/bd/15/23761a50b737a712aacac51c718906ce3563705a336d2c4ffc
108.6 Successfully built pyspark thrift dbt-spark logbook minimal-snowplow-tracker future PyHive pure-sasl
108.6 Failed to build sasl
108.6 ERROR: Could not build wheels for sasl, which is required to install pyproject.toml-based projects
------
Dockerfile:104
--------------------
 102 |         /tmp/* \
 103 |         /var/tmp/*
 104 | >>> RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_spark_ref}#egg=dbt-spark[${dbt_spark_version}]"
 105 |
 106 |
--------------------
ERROR: failed to solve: process "/bin/sh -c python -m pip install --no-cache-dir \"git+https://github.com/dbt-labs/${dbt_spark_ref}#egg=dbt-spark[${dbt_spark_version}]\"" did not complete successfully: exit code: 1

It works fine if I use the ODBC Spark version.

Update: I have the same problem locally (not in Docker) if I switch from Python 3.10 to 3.11. So the problem is related to https://github.com/dbt-labs/dbt-spark/issues/864

leo-schick commented 1 year ago

The issue here is caused by the sasl package, which no longer works with Python 3.11. As a workaround, install pyhive with the extra hive_pure_sasl, which uses pure-sasl instead of the sasl package. For a proper fix, dbt-spark should depend on pyhive[hive_pure_sasl] instead of just pyhive when installing dbt-spark[pyhive].

You can easily reproduce this issue by running pip install pyhive vs. pip install pyhive[hive_pure_sasl] on a Python 3.11 installation.
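
The workaround described above could be sketched as a small helper that picks the PyHive extra based on the interpreter version. This is only an illustration, not how dbt-spark resolves its dependencies: the function name `pyhive_requirement` and the 3.11 cutoff constant are hypothetical, and it assumes that PyHive's `hive` extra is what pulls in the `sasl` package while `hive_pure_sasl` pulls in pure-sasl instead.

```python
import sys

# Assumption taken from this thread: the `sasl` package fails to build
# on Python 3.11 and later.
SASL_BROKEN_FROM = (3, 11)


def pyhive_requirement(version=None):
    """Return a pip requirement string for PyHive (hypothetical helper).

    `version` is a (major, minor) tuple; it defaults to the running interpreter.
    """
    if version is None:
        version = sys.version_info[:2]
    major_minor = tuple(version[:2])
    if major_minor >= SASL_BROKEN_FROM:
        # Use the pure-Python SASL implementation to avoid building `sasl`.
        return "pyhive[hive_pure_sasl]"
    # Assumption: the `hive` extra is the one that depends on `sasl`.
    return "pyhive[hive]"


if __name__ == "__main__":
    # Print the install command for the current interpreter instead of running it.
    print(f"{sys.executable} -m pip install '{pyhive_requirement()}'")
```

Running the script only prints the pip command, so it can be inspected before anything is installed.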

github-actions[bot] commented 6 months ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot] commented 6 months ago

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.