owid / etl

A compute graph for loading and transforming OWID's data
https://docs.owid.io/projects/etl
MIT License

Turn on co-generation of Parquet files #3490

Open larsyencken opened 2 weeks ago

larsyencken commented 2 weeks ago

Background

By default, we generate the data catalog in Feather format, which our benchmarks showed to be the fastest and most compact columnar format, slightly ahead of Parquet. However, Parquet has become much more widely supported in the community, so we would like to generate it by default as well.
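As a rough illustration of what co-generation could look like, here is a minimal sketch that writes one table to both formats side by side. It is not the ETL's actual code: `write_table` and `DEFAULT_FORMATS` are hypothetical names, and the only assumption about the table is that it exposes pandas-style `to_feather(path)` and `to_parquet(path)` methods.

```python
from pathlib import Path
from typing import Iterable, List

# Hypothetical default; the real ETL setting (and its name) may differ.
DEFAULT_FORMATS = ("feather", "parquet")


def write_table(df, base: Path, formats: Iterable[str] = DEFAULT_FORMATS) -> List[Path]:
    """Write one table to every requested format and return the paths written.

    `df` is any object with pandas-style `to_feather(path)` and
    `to_parquet(path)` methods, e.g. a pandas DataFrame.
    """
    written: List[Path] = []
    for fmt in formats:
        # Look up the writer method by name, e.g. df.to_parquet.
        writer = getattr(df, f"to_{fmt}", None)
        if writer is None:
            raise ValueError(f"unsupported format: {fmt}")
        # Append the extension rather than replacing one, so dotted
        # table names are kept intact.
        path = base.parent / f"{base.name}.{fmt}"
        writer(path)
        written.append(path)
    return written
```

With pandas and pyarrow installed, calling `write_table(df, Path("population"))` on a DataFrame with a default index would be expected to produce `population.feather` and `population.parquet` next to each other. Keeping the format list as a parameter means Parquet can be switched on or off without touching the write logic.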

Task

larsyencken commented 2 weeks ago

@Marigold Would we need to trigger a new ETL epoch to get existing datasets regenerated?