fiboa / cli

CLI for fiboa (validation, inspection, schema and file creation, etc.)
https://pypi.org/project/fiboa-cli/
Apache License 2.0

GeoParquet output doesn't work with DuckDB #29

Closed cholmes closed 6 months ago

cholmes commented 6 months ago

Unfortunately, we're using Brotli compression, which DuckDB doesn't support. Trying to open LFK-AKTI_EPSG25832.parquet from Source Cooperative, or files I've converted myself, results in: Error: Invalid Error: Unsupported compression codec "BROTLI". Supported options are uncompressed, gzip, snappy or zstd.

I think zstd is generally recommended, see https://twitter.com/kylebarron2/status/1691483893461839872

m-mohr commented 6 months ago

See also #30 and https://github.com/duckdb/duckdb/discussions/5313

m-mohr commented 6 months ago

Just as an example for de_sh on why brotli looks compelling:

Zipped GeoPackage: 72.3 MB
Brotli GeoParquet: 57.8 MB
ZSTD GeoParquet: 77.0 MB

cholmes commented 6 months ago

Yeah, I agree Brotli looks pretty awesome for geo data. With the France EuroCrops dataset it's about 3.0 GB with Brotli vs. 3.45 GB with zstd.
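To put the two comparisons in this thread on the same scale, here is the arithmetic as a small sketch. The sizes are the ones quoted in the comments above; `percent_larger` is a hypothetical helper.

```python
def percent_larger(a, b):
    """Return how much larger a is than b, in percent."""
    return (a / b - 1) * 100


# de_sh: ZSTD GeoParquet (77.0 MB) vs. Brotli GeoParquet (57.8 MB)
de_sh_overhead = round(percent_larger(77.0, 57.8), 1)

# France EuroCrops: zstd (3.45 GB) vs. Brotli (3.0 GB)
france_overhead = round(percent_larger(3.45, 3.0), 1)

print(de_sh_overhead)   # 33.2
print(france_overhead)  # 15.0
```

So for these two datasets the zstd output is roughly 33% and 15% larger than the Brotli output, respectively.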