In an OpenJDK Docker image I installed Python and then PySpark with pip, and used [snippet not shown] to load spark-xml. When calling [snippet not shown] I get the following error: [error not shown]

What am I missing? Is it a version mismatch issue? I am using PySpark 3.2.0, OpenJDK 8, Python 3.9.8, and com.databricks:spark-xml_2.13:0.16.0.
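A minimal sketch of that kind of setup (the file path, row tag, and app name below are illustrative placeholders, not the exact code from the report):

```python
from pyspark.sql import SparkSession

# Pull spark-xml from Maven at session startup; the _2.13 suffix
# in the coordinate is the Scala version the artifact was built for.
spark = (
    SparkSession.builder
    .appName("spark-xml-repro")  # placeholder name
    .config("spark.jars.packages", "com.databricks:spark-xml_2.13:0.16.0")
    .getOrCreate()
)

# Read an XML file with spark-xml; rowTag and the path are placeholders.
df = (
    spark.read.format("xml")
    .option("rowTag", "book")
    .load("/data/books.xml")
)
df.printSchema()
```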
It is: a Scala version mismatch. You are probably using Scala 2.12 but have added the 2.13 artifact. Use _2.12.
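Concretely, only the artifact suffix needs to change; a sketch of the corrected session config (everything else from the setup above can stay the same):

```python
from pyspark.sql import SparkSession

# PySpark 3.2.0 from PyPI ships a Spark built against Scala 2.12,
# so the dependency must use the matching _2.12 artifact.
spark = (
    SparkSession.builder
    .config("spark.jars.packages", "com.databricks:spark-xml_2.12:0.16.0")
    .getOrCreate()
)
```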
Ah, yes! Thank you for the quick reply; that was easy to miss. Maybe it would be good to highlight in the spark-xml documentation that the Scala versions must match. So the default for Spark and PySpark is currently still Scala 2.12.
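For anyone unsure which Scala version their Spark runs on, one way to check from PySpark is through the Py4J gateway (note that `_jvm` is an internal handle, so this is a quick sanity check rather than a stable API):

```python
# Prints something like "version 2.12.15" for stock PySpark 3.2.0.
print(spark.sparkContext._jvm.scala.util.Properties.versionString())
```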
That's just true of any Scala dependency, though; I think many people would know that, and/or you'd run into it with other libraries.