dbt-labs / dbt-external-tables

dbt macros to stage external sources
https://hub.getdbt.com/dbt-labs/dbt_external_tables/latest/
Apache License 2.0

Fix spark odbc / databricks integration tests #156

Closed jeremyyeo closed 1 year ago

jeremyyeo commented 2 years ago

Describe the bug

The databricks integration tests are running into errors like:

23:35:24  Encountered an error:
Runtime Error
  Runtime Error
    Database Error
      failed to connect
23:35:24  Traceback (most recent call last):
  File "/home/dbt_test_user/project/venv/lib/python3.8/site-packages/dbt/adapters/spark/connections.py", line 436, in open
    conn = pyodbc.connect(connection_str, autocommit=True)
pyodbc.Error: ('HY000', '[HY000] [unixODBC][Simba][ODBC] (11560) Unable to locate SQLGetPrivateProfileString function. (11560) (SQLDriverConnect)')

This is caused by pyodbc 4.0.34 (https://github.com/mkleehammer/pyodbc/issues/1079). The fix is to pin pyodbc, as was done in https://github.com/dbt-labs/dbt-spark/pull/398.
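A quick way to confirm that an environment picked up the problematic release is to check the installed pyodbc version. The snippet below is only a sketch: the helper names (`installed_pyodbc_version`, `is_affected`) are made up for illustration, and it assumes that 4.0.34 is the only affected version, since that is the only one named here.

```python
from importlib import metadata  # standard library since Python 3.8

# Assumption: only the version named in this issue is treated as affected.
AFFECTED_VERSIONS = {"4.0.34"}


def installed_pyodbc_version():
    """Return the installed pyodbc version string, or None if it is not installed."""
    try:
        return metadata.version("pyodbc")
    except metadata.PackageNotFoundError:
        return None


def is_affected(version):
    """True if the given version string is one we treat as affected."""
    return version in AFFECTED_VERSIONS


if __name__ == "__main__":
    version = installed_pyodbc_version()
    if version is None:
        print("pyodbc is not installed")
    elif is_affected(version):
        print(f"pyodbc {version} is installed; consider pinning an earlier release")
    else:
        print(f"pyodbc {version} is installed")
```

Running this inside the test image right after installing dbt-spark[ODBC] should show which pyodbc version pip resolved.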

Steps to reproduce

I pulled the image (https://github.com/dbt-labs/dbt-external-tables/blob/main/.circleci/config.yml#L52) and ran it locally. If we install dbt-spark[ODBC] the way run_test.sh does:

https://github.com/dbt-labs/dbt-external-tables/blob/5f29f019f39b4fc9b58ca86b2f688793fd6b246e/run_test.sh#L12

[screenshot]

If you then install pyodbc==4.0.32, the connection works (alternatively, pip install dbt-spark[ODBC]==1.2 also works).
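The workaround can be scripted as follows. This is a minimal sketch rather than what run_test.sh actually does: the helper names (`pip_install_command`, `install`) are hypothetical, and the pinned version is simply the one reported as working above.

```python
import subprocess
import sys

# Version reported as working in this issue.
PINNED_PYODBC = "pyodbc==4.0.32"


def pip_install_command(*requirements):
    """Build a pip install command that targets the current interpreter."""
    return [sys.executable, "-m", "pip", "install", *requirements]


def install(*requirements):
    """Run pip install for the given requirement strings; raises on failure."""
    subprocess.check_call(pip_install_command(*requirements))


# Example usage (left commented out so nothing is installed on import):
# install("dbt-spark[ODBC]")   # may pull in the affected pyodbc release
# install(PINNED_PYODBC)       # then downgrade to the known-good version
```

Invoking pip through `sys.executable -m pip` keeps the install inside the same virtualenv that the tests run in.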

Expected results

The Databricks tests pass.

Actual results

The Databricks tests fail with the error shown above.

Screenshots and log output

https://app.circleci.com/pipelines/github/dbt-labs/dbt-external-tables/234/workflows/073054ce-0ebb-4491-9163-aaac7f98c1c0/jobs/1070

System information

The contents of your packages.yml file:

Which database are you using dbt with?

The output of dbt --version:

Core:
  - installed: 1.3.0-b2
  - latest:    1.2.1    - Ahead of latest version!

Plugins:
  - spark: 1.3.0b2 - Ahead of latest version!

The operating system you're using:

$ cat /etc/*-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS"
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

The output of python --version:

3.8.6

Additional context

This should be a pretty straightforward fix: pin pyodbc (or pip install dbt-spark[ODBC]==1.2, which would work too).