Closed TheDataDexter closed 2 years ago
I think you're reporting against DBR 11, not Spark 3.3.0 per se. That runs fine, or at least the test suites do. That's also not the most recent version (0.15.0 is), though that shouldn't make much difference.
I can't reproduce this on DBR 11 though. Cluster startup is fine after the library is installed too.
The error does not look directly related to spark-xml; it's about arpack. Are you sure it's not some other library you are installing?
Thank you for your feedback. I was able to solve the bug by upgrading to version 0.15.0. This article helped me understand how Databricks manages Maven libraries.
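For anyone hitting the same failure: the fix was simply moving the Maven coordinate up to 0.15.0. As a sketch, the library can be pinned in a Databricks cluster library spec along these lines (the JSON shape follows the Libraries API's `maven` payload; the exact request wrapper may vary by API version):

```json
{
  "maven": {
    "coordinates": "com.databricks:spark-xml_2.12:0.15.0"
  }
}
```

Entering the same coordinate, `com.databricks:spark-xml_2.12:0.15.0`, in the cluster's Libraries UI should be equivalent; the older `0.14.0` coordinate is what triggered the arpack retrieval conflict on DBR 11.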
I don't think that's related. That article just describes how libraries are handled in Maven generally; there's nothing specific to Databricks, and I don't believe it is related to spark-xml.
I am trying to update my Databricks runtime to the newest version (DBR 11.0). However, the spark-xml package is not being installed properly. On the older Databricks runtimes the package is installed with no problems.
Maven coordinates: com.databricks:spark-xml_2.12:0.14.0
DBR 11.0 configurations:
DBR 10.5 configurations:
Error Code:
DRIVER_LIBRARY_INSTALLATION_FAILURE
Error Message: Library resolution failed because problem during retrieve of com.databricks#dbc-parent: java.lang.RuntimeException: Multiple artifacts of the module net.sourceforge.f2j#arpack_combined_all;0.1 are retrieved to the same file! Update the retrieve pattern to fix this error.