Closed lingeshr-db closed 3 months ago
Unfortunately I don't have access to Databricks. DataComPy is pretty agnostic with its Spark code. Assuming you are using the Pandas on Spark version, it's all just vanilla code in that sense.
I have a PySpark SQL PR which is in review (#310) if you want to try that out.
Just checked the docs: https://docs.databricks.com/en/compute/access-mode-limitations.html#udf-limitations-for-unity-catalog-shared-access-mode
In Databricks Runtime 13.3 LTS and above, Python scalar UDFs and Pandas UDFs are supported. Other Python UDFs, including UDAFs, UDTFs, and Pandas on Spark are not supported.
Your env won't support the Pandas on Spark API implementation. You need to use either Legacy or the new SparkSQLCompare once released (assuming that works ok).
We encountered an issue when using SparkCompare (datacompy-0.12.0) against tables registered with Unity Catalog on a Databricks DBR 13.3 (or any version >13.3) cluster.
The error message is:
py4j.security.Py4JSecurityException: Method public boolean org.apache.spark.sql.internal.CatalogImpl.dropTempView(java.lang.String) is not whitelisted on class class org.apache.spark.sql.internal.CatalogImpl
This is a known limitation when working with Unity Catalog registered tables on Databricks: not all functions are whitelisted, in particular internal CatalogImpl APIs such as dropTempView(). A function that is not whitelisted by default may be considered unsafe, meaning table ACLs or other access controls cannot be enforced for it, so Unity Catalog blocks it.
As a workaround, the public API counterparts, such as spark.catalog.dropTempView(), should be used instead of the restricted internal APIs.
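To illustrate the workaround, here is a minimal sketch of cleaning up a temp view through the public `spark.catalog.dropTempView()` call while tolerating the whitelist error. The helper name `drop_temp_view` is hypothetical and not part of DataComPy, and matching on the "is not whitelisted" text is an assumption based on the error message above. Whether the public call is actually permitted depends on the cluster's access mode, so treat this as a sketch rather than a confirmed fix.

```python
def drop_temp_view(spark, view_name):
    """Drop a temp view via the public catalog API.

    `spark` is expected to be a SparkSession (anything exposing
    `.catalog.dropTempView` works). Returns True if the call went
    through, False if the cluster blocked it as not whitelisted.
    Hypothetical helper, not part of DataComPy.
    """
    try:
        # Public PySpark API: SparkSession.catalog.dropTempView(viewName)
        spark.catalog.dropTempView(view_name)
        return True
    except Exception as exc:
        # Assumption: the security error can be recognised by the
        # "is not whitelisted" text seen in the reported message.
        if "is not whitelisted" in str(exc):
            return False
        # Anything else is a genuine failure and should not be hidden.
        raise
```

On a cluster this would be called as `drop_temp_view(spark, "my_temp_view")`, where `my_temp_view` is a placeholder name. A `False` result means the view was left in place and the session still needs another way to clean it up.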
SparkCompare may depend on more such internal APIs and so may not work (at least as of today) with Databricks Unity Catalog tables.
Please investigate this issue and consider updating SparkCompare to utilize the public API counterparts to ensure compatibility with Unity Catalog-enabled tables/environments. Thank you!
P.S: