databricks / spark-xml

XML data source for Spark SQL and DataFrames
Apache License 2.0

[Clean Up] Remove some duplicated code with Spark and use the ones directly from Spark Repo #596

Closed ericsun95 closed 2 years ago

ericsun95 commented 2 years ago

The files src/main/scala/com/databricks/spark/xml/util/CompressionCodecs.scala and src/main/scala/com/databricks/spark/xml/util/ParseMode.scala duplicate the corresponding classes in org.apache.spark.sql.catalyst.util. We should remove the local copies and use the Spark classes directly, so this code stays up to date with Spark.
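As a rough illustration of what using the Spark classes directly could look like, here is a minimal sketch. It assumes spark-catalyst (Spark 3.x) is on the classpath and that these catalyst utilities stay accessible; they are internal Spark classes, so that is not guaranteed across versions. The object `SharedUtilSketch` and its two helpers are hypothetical names for illustration, not part of spark-xml or Spark.

```scala
// Assumption: spark-catalyst (Spark 3.x) is on the classpath.
// These are Spark's own classes, imported instead of local copies.
import org.apache.spark.sql.catalyst.util.{
  CompressionCodecs, DropMalformedMode, FailFastMode, ParseMode, PermissiveMode
}

// Hypothetical wrapper object, for illustration only.
object SharedUtilSketch {

  // Parse a user-supplied mode string with Spark's ParseMode and
  // describe what the resulting mode does with malformed records.
  def describeMode(raw: String): String = ParseMode.fromString(raw) match {
    case PermissiveMode    => "keep malformed records"
    case DropMalformedMode => "drop malformed records"
    case FailFastMode      => "fail on the first malformed record"
  }

  // Resolve a short codec name (for example "gzip") to a Hadoop
  // codec class name using Spark's CompressionCodecs.
  def codecClassName(shortName: String): String =
    CompressionCodecs.getCodecClassName(shortName)
}
```

Call sites in this repo would then only need their imports changed, provided the local copies and the Spark versions expose compatible methods; that compatibility would have to be checked per call site.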

Next: one step toward aligning this repo with the latest Spark is to update some places to the newer V1 interfaces (since DataSourceV2 requires falling back to V1), for example moving from Row to InternalRow. That would also help remove duplicated classes such as PartialResultException.
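To make the Row versus InternalRow difference concrete, here is a small sketch under the same assumption (Spark 3.x with spark-sql and spark-catalyst on the classpath). The object name `RowSketch` and the sample values are made up for illustration; a real migration would go through the parser and schema code rather than building rows by hand.

```scala
// Assumption: spark-sql and spark-catalyst (Spark 3.x) are on the classpath.
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.unsafe.types.UTF8String

// Hypothetical object, for illustration only.
object RowSketch {

  // External Row: holds ordinary JVM values such as String and Int.
  val external: Row = Row("spark-xml", 596)

  // InternalRow: holds Catalyst's internal representations, so a
  // string column must be stored as UTF8String rather than String.
  val internal: InternalRow =
    InternalRow(UTF8String.fromString("spark-xml"), 596)

  // Read the same logical fields back from each representation.
  def externalName: String = external.getString(0)
  def internalName: String = internal.getUTF8String(0).toString
  def internalNumber: Int  = internal.getInt(1)
}
```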