springml / spark-sftp

Spark connector for SFTP
Apache License 2.0

Write large dataframes to a directory #21

Open ssimeonov opened 6 years ago

ssimeonov commented 6 years ago

Is it possible to write large dataframes the way big data tools typically handle them: one directory per dataframe, with the data split across multiple files? For example, it doesn't make sense to write a dataframe with 8+ billion rows to a single CSV file.

samuel-pt commented 6 years ago

Currently we write the output to a single file. We can add a parameter to write the results into multiple files.

ssimeonov commented 6 years ago

That would be very helpful since the current approach only works for small data.
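
To make the requested layout concrete, here is a minimal sketch in plain Python of writing one directory per dataset, split into numbered part files. It does not use spark-sftp or Spark; `write_partitioned_csv`, `rows_per_file`, and the `part-NNNNN.csv` naming are hypothetical names chosen for illustration, not an existing connector option.

```python
import csv
import os
from itertools import islice


def write_partitioned_csv(rows, header, out_dir, rows_per_file):
    """Write rows into out_dir as part-00000.csv, part-00001.csv, ...

    Hypothetical helper for illustration only; not part of spark-sftp.
    Returns the list of file paths written, in order.
    """
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")
    os.makedirs(out_dir, exist_ok=True)

    row_iter = iter(rows)  # consume lazily so the full dataset is never in memory
    paths = []
    while True:
        chunk = list(islice(row_iter, rows_per_file))
        if not chunk:
            break
        path = os.path.join(out_dir, f"part-{len(paths):05d}.csv")
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)  # repeat the header so each part stands alone
            writer.writerows(chunk)
        paths.append(path)
    return paths
```

Each part file could then be transferred on its own, so no single upload has to hold the whole dataset. Spark's built-in file writers already produce this kind of directory of part files; the sketch only mirrors that layout.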