Merged the changes. Thanks for creating this well-documented, neatly coded PR.
Will release the library to the Maven repository and spark-packages.org next week.
Overview of the PR:
Presently we support txt, avro, parquet, csv, and json. This PR adds support for XML files too. XML handling comes from spark-xml (https://github.com/databricks/spark-xml).
We have added spark-xml as a dependency of spark-sftp so that we can read/write XML files from SFTP servers.
How the code was tested:
The code was tested using the following statements.
val df = spark.read
  .format("com.springml.spark.sftp")
  .option("host", "SFTP-SERVER")
  .option("username", "SFTP-USER")
  .option("password", "****")
  .option("fileType", "xml")
  .option("rowTag", "YEAR")
  .load("myxml.xml")
Out of scope:
Presently we support only basic read/write for XML files, mainly through the rowTag and rootTag params, which is enough for simple reads and writes. We can add more spark-xml parameters in the future.
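For reference, a basic read followed by a write with these two params might look roughly like the sketch below. It is an illustration, not the code that was tested: it assumes the write side passes rowTag and rootTag through to spark-xml, and the object name XmlOverSftpSketch, the rootTag value "YEARS", and the target path "myxml-copy.xml" are made up for the example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical sketch object; not part of spark-sftp.
object XmlOverSftpSketch {
  // Connection options shared by read and write; all values are placeholders.
  val connection: Map[String, String] = Map(
    "host"     -> "SFTP-SERVER",
    "username" -> "SFTP-USER",
    "password" -> "****",
    "fileType" -> "xml"
  )

  // rowTag names the XML element that is treated as one row.
  val readOptions: Map[String, String] = connection + ("rowTag" -> "YEAR")

  // rootTag names the element that wraps all rows on write.
  // "YEARS" is an assumed value chosen for illustration.
  val writeOptions: Map[String, String] = readOptions + ("rootTag" -> "YEARS")

  // Reads the XML file from the SFTP server and writes it back under a new name.
  def roundTrip(spark: SparkSession): DataFrame = {
    val df = spark.read
      .format("com.springml.spark.sftp")
      .options(readOptions)
      .load("myxml.xml")

    df.write
      .format("com.springml.spark.sftp")
      .options(writeOptions)
      .save("myxml-copy.xml") // hypothetical target path

    df
  }
}
```

Keeping the options in plain maps makes it easy to see which params differ between read and write, and to add further spark-xml parameters later.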