SlideRuleEarth / sliderule

Server and client framework for on-demand science data processing in the cloud
https://slideruleearth.io

Optimize GeoParquet file organization #160

Open jpswinski opened 1 year ago

jpswinski commented 1 year ago

Analyze the size of the Parquet files and try different compression settings and chunk sizes. parquet-tools inspect shows negative compression ratios.
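As a rough illustration of why chunk size matters here, the sketch below compresses a synthetic column in independent chunks using only the Python standard library. It does not use Parquet itself: `compressed_size`, the `per_chunk_overhead` value, and the synthetic data are all assumptions made for the example, and "negative ratio" is read here as compressed output being larger than the raw input.

```python
import struct
import zlib


def compressed_size(values, chunk_rows, per_chunk_overhead=64):
    """Compress `values` (floats) in independent chunks and return total bytes.

    per_chunk_overhead is a hypothetical stand-in for per-chunk metadata;
    it is not a number taken from the Parquet format.
    """
    total = 0
    for start in range(0, len(values), chunk_rows):
        chunk = values[start:start + chunk_rows]
        raw = struct.pack(f"<{len(chunk)}d", *chunk)
        total += len(zlib.compress(raw)) + per_chunk_overhead
    return total


# Synthetic column with a repeating pattern (made-up data, for illustration only).
values = [100.0 + (i % 50) * 0.25 for i in range(20_000)]
raw_size = len(values) * 8  # 8 bytes per float64

for chunk_rows in (4, 100, 20_000):
    size = compressed_size(values, chunk_rows)
    print(f"chunk_rows={chunk_rows:>6}  compressed={size:>8}  raw/compressed={raw_size / size:.2f}")
```

With very small chunks the fixed cost per chunk dominates and the compressed total exceeds the raw size; larger chunks give the compressor more context and amortize the overhead.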

jpswinski commented 1 year ago

Larger row groups were implemented, which significantly reduced the size of the GeoParquet files. Different encoding options still need to be investigated.

jpswinski commented 1 year ago

https://parquet.apache.org/docs/file-format/data-pages/encodings/
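One of the encodings listed on that page is delta encoding (DELTA_BINARY_PACKED). The sketch below shows only the underlying idea, storing differences instead of absolute values, using the Python standard library. It is not the Parquet implementation (which also bit-packs the deltas in blocks), and the helper names and synthetic timestamps are assumptions made for the example.

```python
import struct
import zlib


def delta_encode(values):
    """Return the first value followed by successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Invert delta_encode by taking a running sum."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def packed_size(ints):
    """Bytes after packing as int64 and compressing with zlib."""
    return len(zlib.compress(struct.pack(f"<{len(ints)}q", *ints)))


# Synthetic, monotonically increasing timestamps with slight jitter (hypothetical data).
timestamps = [1_600_000_000_000 + i * 20 + (i % 3) for i in range(10_000)]
deltas = delta_encode(timestamps)

assert delta_decode(deltas) == timestamps  # the transform is lossless
print("plain:", packed_size(timestamps), "bytes")
print("delta:", packed_size(deltas), "bytes")
```

For sorted or slowly changing integer columns the deltas are small and repetitive, so they tend to compress much better than the absolute values. Whether this helps for a given GeoParquet column would need to be measured on real data.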