What pain point is this feature intended to address? Please describe.
Spark's default behavior for writing files targets HDFS: it creates an entire output folder containing one part file per partition, plus a `_SUCCESS` marker file to indicate that the output is complete. This is annoying when all you want is a single CSV or JSON file.
Describe the solution you'd like
Although we already collapse the data down to a single partition, it would be helpful if, for single-file formats like CSV, JSON, XML, etc., the system wrote only a single file whenever the user is outputting to the local filesystem or as a file artifact. For example, we could write to a temporary directory and then copy the lone CSV or JSON file out to the target path.
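The temp-directory-then-copy step could be sketched roughly as follows. This is a minimal pure-Python illustration, not an existing API: the helper name `collapse_to_single_file` and its behavior are assumptions, and the Spark write itself is elided — the sketch only shows the post-write cleanup, i.e. locating the lone `part-*` file in a Spark-style output folder and moving it to the user's target path.

```python
import glob
import os
import shutil


def collapse_to_single_file(output_dir: str, target_path: str) -> str:
    """Move the single part file out of a Spark-style output folder.

    Assumes the DataFrame was already coalesced to one partition, so the
    folder contains exactly one ``part-*`` file plus a ``_SUCCESS`` marker.
    """
    parts = glob.glob(os.path.join(output_dir, "part-*"))
    if len(parts) != 1:
        raise ValueError(f"expected exactly one part file, found {len(parts)}")
    # Move the lone data file to the path the user actually asked for,
    # then drop the now-redundant folder (including the _SUCCESS marker).
    shutil.move(parts[0], target_path)
    shutil.rmtree(output_dir)
    return target_path
```

In a real implementation the write would target a temporary sibling directory (so a failed job never clobbers the destination), and the final move would stay on the same filesystem so it is an atomic rename rather than a copy.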