Closed: balbarka closed this issue 1 month ago
It looks like going against Azure directly requires additional permission configurations in Spark.
This issue appears to be related to Unity Catalog. Per the link above:
Unity Catalog ignores Spark configuration settings when accessing data managed by external locations.
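For context, the kind of cluster-level Spark configuration that Unity Catalog ignores looks like the following. This is a hedged sketch: the storage account name and secret scope/key names are hypothetical placeholders, not values from this issue.

```
# Cluster Spark config for direct ABFS access (ignored by Unity Catalog
# when the path is governed by an external location).
# "mystorageaccount", "my-scope", and "my-storage-key" are placeholders.
spark.hadoop.fs.azure.account.key.mystorageaccount.dfs.core.windows.net {{secrets/my-scope/my-storage-key}}
```

Under Unity Catalog, access to such paths is instead governed by storage credentials and external locations, which is why account-key settings like the above are not honored.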
A lot of teams working in the healthcare space use Unity Catalog to stay compliant with various security requirements and regulations related to PHI. It would be great to be able to process MRF files in the same Databricks workspace as the rest of your protected healthcare data. I would anticipate that this will be brought up in the upcoming webinar, Building a Lakehouse for Healthcare: Unlocking Price Transparency.
Hi @tadtenacious, I agree with your assessment that this is related to the infrastructure setup with Databricks and Unity Catalog. With multiple environments available, we have the code working fine against dbfs, s3a, abfss...
The plan for the workshop is to focus on the specific technical and functional challenges with regard to price transparency.
We'll revisit this right after the workshop to round out the issue and provide a resolution in case other folks run into it. Stay tuned!
Doing some further research with UC + external locations, it appears that structured streaming is supported on single-user clusters running DBR 11.3 LTS.
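A minimal sketch of what that streaming read might look like on a single-user DBR 11.3 LTS cluster. The container, storage account, and path names are hypothetical placeholders, and `external_location_path` is a helper introduced here for illustration, not part of any Databricks API.

```python
def external_location_path(container: str, account: str, path: str) -> str:
    """Build an abfss:// URI in the form Azure external locations use.

    All arguments are caller-supplied placeholders; this just formats the URI.
    """
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def read_mrf_stream(spark, source_path: str):
    """Start a structured-streaming read of raw MRF text files.

    Requires a live Spark session on a Databricks single-user cluster;
    under Unity Catalog, access to source_path is granted via an
    external location rather than cluster Spark config.
    """
    return spark.readStream.format("text").load(source_path)


# Example (on a cluster): path names below are placeholders.
# source = external_location_path("mrf", "myaccount", "raw/in-network")
# stream = read_mrf_stream(spark, source)
```

The helper is kept separate from the Spark call so the URI construction can be reasoned about (and tested) without a cluster.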
I will tag this as needing a version upgrade to Spark 3.3.0 (DBR 11.3 LTS).
UC does not seem to support custom Spark streaming sources.
Unable to use a Unity Catalog external location path. Failing path example:
cluster: https://adb-8590162618558854.14.azuredatabricks.net/?o=8590162618558854#setting/clusters/1021-151431-hs1h9l81/configuration
exception: Failure to initialize configuration: Invalid configuration value detected for fs.azure.account.key