springml / spark-sftp

Spark connector for SFTP
Apache License 2.0

Add support to text files for read/write #32 #33

Closed bagopalan closed 5 years ago

bagopalan commented 6 years ago

What is the purpose of this pull request?

Currently spark-sftp supports JSON, Avro, Parquet, and CSV files. This pull request adds support for text files. We want to use spark-sftp in our Gimel project: https://github.com/paypal/gimel. Gimel provides a unified Data API to access data from any storage, including HDFS, GS, Alluxio, HBase, Aerospike, BigQuery, Druid, Elastic, Teradata, Oracle, and MySQL.

How was this patch tested?

1) Placed a text file on the SFTP server.

2) Used the following code to read the file named config, then printed the content of the DataFrame:

    val df = spark.read.
      format("com.springml.spark.sftp").
      option("host", "SFTP_SERVER").
      option("username", "USER").
      option("password", "xxxxx").
      option("fileType", "txt").
      option("header", "false").
      load("config")

3) For write, used the following code to write the DataFrame to SFTP:

    df.write.
      format("com.springml.spark.sftp").
      option("host", "SFTP_SERVER").
      option("username", "USER").
      option("password", "xxxxxx").
      option("fileType", "txt").
      option("hdfsTempLocation", "/tmp/basu/").
      save("newConfig.gz")

samuel-pt commented 6 years ago

@bagopalan - Code changes are fine. Could you please modify the README to include txt files as well? Also, please add sample code to read a txt file.

bagopalan commented 6 years ago

@samuel-pt, I changed the README to include sample code to read a text file. I had already added a sample text file to the resources, which will be used by the test classes.

samuel-pt commented 5 years ago

@bagopalan - Merged the changes. I will publish the release next week.