awslabs / deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Apache License 2.0

Update Deequ to Spark 3.4 #481

Closed (closed by jklap 1 year ago)

jklap commented 1 year ago

We currently run Deequ on Databricks, on an older Databricks runtime that ships with Spark 3.3.0.

Unfortunately, we need to move to a more recent Databricks runtime because of some feature updates, and that means moving to Spark 3.4.0.

Please consider this a request to update Deequ to support Spark 3.4.0.

eycho-am commented 1 year ago

Merged PR #505, which adds Spark 3.4.0 support.

chenliu0831 commented 1 year ago

Could we release this to Maven?