great-expectations / great_expectations

Always know what to expect from your data.
https://docs.greatexpectations.io/
Apache License 2.0

[BUG] Unable to get both snowflake and databricks working with gx at the same time #9650

Closed satniks closed 2 weeks ago

satniks commented 4 months ago

I am unable to get both Snowflake and Databricks working with great-expectations 0.18.11 or 0.18.12. I could not find a set of Python packages that works for both of them.

I tried the following:

  1. Installed the following libraries for Databricks: "pip install great_expectations[sqlalchemy]" and "pip install great_expectations[databricks]".
  2. Wrote expectations for the Databricks data source, which work nicely.
  3. Then installed the following library for Snowflake: "pip install great_expectations[snowflake]".
  4. Then wrote expectations for Snowflake, which also work as expected. But this broke my Databricks setup. When I ran my existing Databricks expectations, I got the following error:

    raise SQLDatasourceError(
    great_expectations.datasource.fluent.sql_datasource.SQLDatasourceError: Unable to create a SQLAlchemy engine due to the following exception: module 'sqlalchemy.types' has no attribute 'Uuid'

Maybe this is because installing "great_expectations[sqlalchemy]" downgrades SQLAlchemy from 2.0.28 to 1.4.52.

Databricks and Snowflake are two important data sources that should work together in a common Python environment. Can someone provide a "pip freeze" setup that works for both?
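For anyone hitting the same error: `sqlalchemy.types.Uuid` was added in SQLAlchemy 2.0, so an environment downgraded to 1.4.x will not have it. The following is a minimal diagnostic sketch, not part of Great Expectations; the helper names `has_uuid_type` and `check_installed_sqlalchemy` are made up for illustration. It only reports whether the SQLAlchemy visible to the current interpreter exposes the attribute named in the traceback.

```python
from importlib import import_module


def has_uuid_type(types_module) -> bool:
    """Return True if the given module-like object exposes a `Uuid` attribute."""
    return hasattr(types_module, "Uuid")


def check_installed_sqlalchemy() -> str:
    """Describe the SQLAlchemy install seen by this interpreter (hypothetical helper)."""
    try:
        sqlalchemy = import_module("sqlalchemy")
    except ImportError:
        return "SQLAlchemy is not installed"
    # Look up the exact attribute the error message complains about.
    found = has_uuid_type(import_module("sqlalchemy.types"))
    status = "present" if found else "missing"
    return f"SQLAlchemy {sqlalchemy.__version__}: sqlalchemy.types.Uuid is {status}"


if __name__ == "__main__":
    print(check_installed_sqlalchemy())
```

Running this before and after installing the Snowflake extra should show whether the SQLAlchemy version changed underneath the Databricks setup.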

satniks commented 4 months ago

The following issue could be related: https://github.com/databricks/databricks-sql-python/issues/291

satniks commented 4 months ago

It seems the snowflake-sqlalchemy team is going to add support for SQLAlchemy 2.x. Once that is done, there will be a common SQLAlchemy version that works with both Snowflake and Databricks. I am not sure about the timeline, though. I hope great-expectations 1.0 will support the latest Snowflake and Databricks versions at the same time in a common Python environment.

https://github.com/snowflakedb/snowflake-sqlalchemy/issues/380

Kilo59 commented 4 months ago

@satniks Yes, as soon as the snowflake-sqlalchemy package supports SQLAlchemy 2, things should work. Until then, there's nothing we can do about it.

satniks commented 4 months ago

Thanks @Kilo59 for the confirmation. I will wait for SQLAlchemy 2 support in snowflake-sqlalchemy.

Until then, I will use the workaround provided here. It works nicely.

https://github.com/snowflakedb/snowflake-sqlalchemy/issues/380#issuecomment-1470762025

Kilo59 commented 2 weeks ago

@satniks snowflake-sqlalchemy 1.6.1 was just released, and it supports SQLAlchemy 2. It should be possible to install both of these in the same environment now.

https://pypi.org/project/snowflake-sqlalchemy/1.6.1/#history
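To confirm that an environment picked up the versions discussed in this thread, a rough check along these lines may help. It is a sketch under assumptions: the helper names are hypothetical, the thresholds (snowflake-sqlalchemy 1.6.1 and SQLAlchemy 2) come only from the comments above, and it does not verify whether a given great_expectations release allows those versions through its own dependency pins.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Leading numeric release components, e.g. '1.6.1' -> (1, 6, 1).

    Pre-release and post-release suffixes are ignored (a simplification).
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(version: str, minimum: Tuple[int, ...]) -> bool:
    """True if `version` is at least `minimum`, compared component by component."""
    return parse_release(version) >= minimum


def installed_version(package: str) -> Optional[str]:
    """Installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Thresholds taken from this thread, not from any official compatibility matrix.
    minimums = {"snowflake-sqlalchemy": (1, 6, 1), "SQLAlchemy": (2,)}
    for package, minimum in minimums.items():
        version = installed_version(package)
        if version is None:
            print(f"{package}: not installed")
        else:
            verdict = "ok" if meets_minimum(version, minimum) else "too old"
            print(f"{package} {version}: {verdict}")
```

If either package reports "too old", the `Uuid` error from the original report is likely to come back for the Databricks data source.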