springml / spark-sftp

Spark connector for SFTP
Apache License 2.0

Write operations use the part file name instead of the file name in save call #67

Open rcornell opened 5 years ago

rcornell commented 5 years ago

When writing DataFrames with this library, I found that the file written to the SFTP server is named after the part file used for the local temp file, not the file name passed to the save call.

    df.write
      .format("com.springml.spark.sftp")
      .option("host", host)
      .option("port", port)
      .option("username", userName)
      .option("password", password)
      .option("fileType", "csv")
      .save("/folder/some_file.csv")

I traced the issue to ChannelSftp.java, around line 442, where "_dst" is set. Rather than using the given target as-is, it appends the part file name to the destination.

So what gets written is /folder/some_file.csv/part-0000-......xyz.csv instead of /folder/some_file.csv.
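
To make the difference concrete, here is a minimal sketch that models the path resolution described above. It is not the library's or ChannelSftp's actual code. The object name, both function names, the local temp directory, and the part file name "part-00000-example.csv" are hypothetical and chosen only for illustration.

```scala
// Hypothetical model of the destination resolution; not the real implementation.
object DestinationSketch {

  // Observed behavior: the target is treated like a directory and the
  // local part file's name is appended to it.
  def observedDestination(target: String, localPartFile: String): String = {
    // Take everything after the last '/' (the whole string if there is none).
    val fileName = localPartFile.substring(localPartFile.lastIndexOf('/') + 1)
    // Avoid a doubled slash if the target already ends with one.
    val base = if (target.endsWith("/")) target.dropRight(1) else target
    s"$base/$fileName"
  }

  // Expected behavior: the path passed to save() is used unchanged.
  def expectedDestination(target: String, localPartFile: String): String =
    target
}

object DestinationSketchDemo extends App {
  val target = "/folder/some_file.csv"
  // Assumed local temp path; the real part file name is generated by Spark.
  val localPart = "/tmp/spark_sftp_tmp/part-00000-example.csv"

  println(DestinationSketch.observedDestination(target, localPart))
  println(DestinationSketch.expectedDestination(target, localPart))
}
```

With these assumed inputs, the first call yields the target with the part file name appended, and the second yields the target itself, which is what I expected the save call to produce.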