fiboa / cli

CLI for fiboa (validation, inspection, schema and file creation, etc.)
https://pypi.org/project/fiboa-cli/
Apache License 2.0

Change parquet compression #30

Closed: cholmes closed 4 months ago

cholmes commented 5 months ago

Changed the default Parquet compression from brotli to zstd. This lets the output work with DuckDB by default, which is a decently important target.

Closes #29

m-mohr commented 4 months ago

I got much better compression with brotli on the German datasets. That's the reason it's set to brotli. Maybe we should provide an option? Ideally DuckDB implements brotli anyway; basing decisions on a single implementation is not ideal. Is there an issue for brotli in DuckDB?

Edit: maybe upvote https://github.com/duckdb/duckdb/discussions/5313 ;)

cholmes commented 4 months ago

Yeah, I think providing an option would be good. I agree basing decisions on a single implementation is not ideal, but I also think DuckDB is one of the better clients to do things in a 'cloud-native geospatial' way, and that it's good to support as many clients as possible. So I definitely lean towards zstd as the default, with brotli as an option.
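
A minimal sketch of what such an option could look like, assuming a plain `argparse` front end and `pyarrow` for writing. The `--compression` flag, `build_parser`, `write_table`, and the `example-writer` program name are hypothetical names for illustration, not the actual fiboa CLI interface. The default is zstd and brotli remains selectable, as proposed above.

```python
import argparse

# Codecs named in this thread; restricting to these two is an assumption.
SUPPORTED_COMPRESSIONS = ("zstd", "brotli")
DEFAULT_COMPRESSION = "zstd"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a hypothetical --compression flag."""
    parser = argparse.ArgumentParser(prog="example-writer")
    parser.add_argument("output", help="path of the Parquet file to write")
    parser.add_argument(
        "--compression",
        choices=SUPPORTED_COMPRESSIONS,
        default=DEFAULT_COMPRESSION,
        help="Parquet compression codec (default: %(default)s)",
    )
    return parser


def write_table(table, path: str, compression: str) -> None:
    """Write a pyarrow Table to `path` using the chosen codec."""
    # Imported lazily so option parsing works without pyarrow installed.
    import pyarrow.parquet as pq

    pq.write_table(table, path, compression=compression)


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["fields.parquet", "--compression", "brotli"])
print(f"would write {args.output} using {args.compression}")
```

With this shape, users who need the smaller files reported for the German datasets can opt into brotli, while the default output stays readable by clients that lack brotli support.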