sparkutils / quality

A Quality Spark DQ Library
https://sparkutils.github.io/quality/
Apache License 2.0

extension / optimisations do not work with hive layer on Databricks #25

Closed chris-twiner closed 1 year ago

chris-twiner commented 1 year ago
  1. global / permanent views do not tolerate extension functions
  2. even on temporary views the optimisations never get loaded

on the same cluster, reading the underlying files directly (as the tests run on dbr do), the optimisations work.

chris-twiner commented 1 year ago

dbr 13.1 shows that create view works with as_uuid and that the query optimisers behave correctly - without photon.

The optimisers are also working in photon, but the filters don't show up as push down filters; they appear as "RequiredDataFilters" under PhotonScan parquet.
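A rough way to check for either form when comparing plans is to scan the explain text for both keys. The following is a minimal sketch only: the helper name `scan_filters` and the sample plan strings are hypothetical, "PushedFilters" is the key open-source Spark prints for file scans, "RequiredDataFilters" is taken from the observation above, and the exact plan layout may differ between runtimes.

```python
import re

# Keys that suggest a filter reached the scan node.
# "PushedFilters": printed by open-source Spark file scans.
# "RequiredDataFilters": the key seen under PhotonScan parquet in this issue.
FILTER_KEYS = ("PushedFilters", "RequiredDataFilters")


def scan_filters(plan_text):
    """Return {key: filter text} for every key with a non-empty filter list.

    Assumes the plan prints filters as `Key: [ ... ]` with no nested
    square brackets; this is a simplification, not a full plan parser.
    """
    found = {}
    for key in FILTER_KEYS:
        match = re.search(re.escape(key) + r":\s*\[([^\]]*)\]", plan_text)
        if match and match.group(1).strip():
            found[key] = match.group(1).strip()
    return found


# Hypothetical plan fragments, for illustration only.
oss_plan = (
    "FileScan parquet [id#0] PushedFilters: [IsNotNull(id), EqualTo(id,1)], "
    "ReadSchema: struct<id:string>"
)
photon_plan = "PhotonScan parquet [id#0] RequiredDataFilters: [isnotnull(id#0)]"
no_pushdown_plan = "FileScan parquet [id#0] PushedFilters: [], ReadSchema: struct<id:string>"

print(scan_filters(oss_plan))
print(scan_filters(photon_plan))
print(scan_filters(no_pushdown_plan))
```

An empty result for a plan captured via hive/sts, next to a non-empty result for the same file read directly, would point at the optimisations not being loaded rather than at a display difference.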

chris-twiner commented 1 year ago

base builtIn fix works; any further issues will be handled in a separate issue or release

chris-twiner commented 1 year ago

extra delta test added, as the extensions are not working when the delta file is on adls (with or without photon) via hive/sts. A dataset reading the same file directly from the same cluster works. Other clusters using dbfs files also work, and the tests show that oss delta works with the external file table/view.