fraugster / parquet-go

Go package to read and write Parquet files. Parquet is a file format that stores nested data structures in a flat columnar layout. It can be used in the Hadoop ecosystem and with tools such as Presto and AWS Athena.
Apache License 2.0

parquet-tool - file statistics command #79

Open panamafrancis opened 2 years ago

panamafrancis commented 2 years ago

As a user, I would like a command to aid in debugging Parquet files. For instance, I would like to obtain the following file stats in a single command:

panamafrancis commented 2 years ago

Re: https://github.com/fraugster/parquet-go/issues/85, how can we provide an estimate of the uncompressed size?
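
One possible approach, sketched here as an assumption rather than as parquet-go's actual API: the Parquet file footer records `total_compressed_size` and `total_uncompressed_size` for every column chunk, so an estimate can be computed by summing those fields across all row groups without reading any data pages. The types and the `summarize` function below are hypothetical stand-ins that mirror the footer fields; they are not types exported by this package. Note that the footer's uncompressed size covers encoded pages before compression, so it is only an approximation of the fully decoded in-memory size.

```go
package main

import "fmt"

// columnChunkMeta is a hypothetical local type that mirrors three fields of
// the Parquet footer's ColumnMetaData struct. It is not parquet-go API.
type columnChunkMeta struct {
	Path                  string // dotted column path, e.g. "user.name"
	TotalCompressedSize   int64  // bytes on disk for this column chunk
	TotalUncompressedSize int64  // bytes of the encoded pages before compression
}

// rowGroupMeta is a hypothetical stand-in for the footer's RowGroup struct.
type rowGroupMeta struct {
	NumRows int64
	Columns []columnChunkMeta
}

// sizeStats holds the file-level totals.
type sizeStats struct {
	Rows         int64
	Compressed   int64
	Uncompressed int64
}

// summarize adds up row counts and column chunk sizes over all row groups.
func summarize(groups []rowGroupMeta) sizeStats {
	var s sizeStats
	for _, g := range groups {
		s.Rows += g.NumRows
		for _, c := range g.Columns {
			s.Compressed += c.TotalCompressedSize
			s.Uncompressed += c.TotalUncompressedSize
		}
	}
	return s
}

func main() {
	// Made-up metadata for illustration only.
	groups := []rowGroupMeta{
		{NumRows: 1000, Columns: []columnChunkMeta{
			{Path: "id", TotalCompressedSize: 400, TotalUncompressedSize: 800},
			{Path: "name", TotalCompressedSize: 1500, TotalUncompressedSize: 6000},
		}},
		{NumRows: 500, Columns: []columnChunkMeta{
			{Path: "id", TotalCompressedSize: 200, TotalUncompressedSize: 400},
			{Path: "name", TotalCompressedSize: 700, TotalUncompressedSize: 3100},
		}},
	}
	s := summarize(groups)
	fmt.Printf("rows=%d compressed=%d uncompressed=%d\n",
		s.Rows, s.Compressed, s.Uncompressed)
}
```

A stats command could print these totals per column as well as per file, which would also show which columns compress poorly.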