This PR adds a new writer type, `parquet.SortingWriter`, which ensures that rows written to row groups are always ordered according to the sorting columns passed in its configuration.
The sorting strategy uses an in-memory buffer that is sorted and then serialized to a row group. When the writer is flushed or closed, all the row groups are merged with a k-way merge, which preserves the global order of rows.
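
To illustrate the strategy described above, here is a minimal, standalone sketch of the two phases: sorting fixed-size buffers into runs, then merging the runs with a heap. It is not the code in this PR and does not use the parquet API. Plain `int` values stand in for rows, and the names `sortedRuns`, `mergeRuns`, `cursor`, and `cursorHeap` are hypothetical, chosen only for this example.

```go
package main

import (
	"container/heap"
	"sort"
)

// sortedRuns splits rows into chunks of at most bufferSize values and sorts
// each chunk independently. Each chunk plays the role of an in-memory buffer
// that is sorted and then written out as one row group.
func sortedRuns(rows []int, bufferSize int) [][]int {
	if bufferSize < 1 {
		bufferSize = 1 // guard against a non-advancing loop
	}
	var runs [][]int
	for start := 0; start < len(rows); start += bufferSize {
		end := start + bufferSize
		if end > len(rows) {
			end = len(rows)
		}
		run := append([]int(nil), rows[start:end]...) // copy, leave input untouched
		sort.Ints(run)
		runs = append(runs, run)
	}
	return runs
}

// cursor tracks the next unread position in one sorted run.
type cursor struct {
	run []int
	pos int
}

// cursorHeap is a min-heap of cursors ordered by their current value.
type cursorHeap []*cursor

func (h cursorHeap) Len() int           { return len(h) }
func (h cursorHeap) Less(i, j int) bool { return h[i].run[h[i].pos] < h[j].run[h[j].pos] }
func (h cursorHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *cursorHeap) Push(x any)        { *h = append(*h, x.(*cursor)) }
func (h *cursorHeap) Pop() any {
	old := *h
	n := len(old)
	c := old[n-1]
	*h = old[:n-1]
	return c
}

// mergeRuns performs a k-way merge: it repeatedly takes the smallest current
// value among all runs, so the output is globally ordered.
func mergeRuns(runs [][]int) []int {
	h := make(cursorHeap, 0, len(runs))
	total := 0
	for _, run := range runs {
		if len(run) > 0 {
			h = append(h, &cursor{run: run})
			total += len(run)
		}
	}
	heap.Init(&h)

	merged := make([]int, 0, total)
	for h.Len() > 0 {
		c := h[0] // cursor holding the smallest current value
		merged = append(merged, c.run[c.pos])
		c.pos++
		if c.pos < len(c.run) {
			heap.Fix(&h, 0) // restore heap order after advancing the cursor
		} else {
			heap.Pop(&h) // this run is exhausted
		}
	}
	return merged
}
```

For example, `mergeRuns(sortedRuns([]int{5, 2, 9, 1, 7, 3, 8}, 3))` sorts three runs (`[2 5 9]`, `[1 3 7]`, `[8]`) and merges them into `[1 2 3 5 7 8 9]`. With k runs and n rows in total, the merge does O(n log k) comparisons and only needs one cursor per run in memory, which is why this approach suits data larger than the buffer.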