segmentio / parquet-go

Go library to read/write Parquet files
https://pkg.go.dev/github.com/segmentio/parquet-go
Apache License 2.0

Support for maintaining datasets in hive style partition format #144

Open himanshpal opened 2 years ago

himanshpal commented 2 years ago

Storing data in Hive-style partitions is a very common use case when writing data in columnar formats to object stores. It would be great if the library added support for the following features with respect to partition management:

achille-roussel commented 2 years ago

Hello @himanshpal, thanks for starting this discussion!

While I agree that interoperability with existing systems is highly valuable, the changes you suggest seem quite broad in scope. Careful design and maintenance considerations are required to ensure that we build effective solutions we can maintain long term.

As a starting point, would you be able to provide documentation and/or datasets that would highlight the goals of the requests you brought up?