ooni / data

OONI Data CLI and Pipeline v5
https://docs.ooni.org/data

Evaluate different data formats for speeding up reprocessing #59

Open hellais opened 8 months ago

hellais commented 8 months ago

Reprocessing is currently slow because it reads from JSONL files. We should consider a more performant format such as Avro, MessagePack (msgpack), or similar.
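To make the comparison concrete, here is a minimal sketch of how decode speed and size could be measured per format. It is not the benchmark from this issue: the record shape, `make_records`, `bench` and `run` are all hypothetical, and `pickle` is only a standard-library stand-in for "some binary format", not a proposal. `msgpack` is added to the comparison only if the third-party package happens to be installed.

```python
import json
import pickle
import time


def make_records(n):
    # Synthetic stand-ins for measurements; the field names are illustrative only.
    return [
        {
            "measurement_uid": f"m{i:08d}",
            "probe_cc": "IT",
            "test_keys": {"failure": None, "rtt": i * 0.001},
        }
        for i in range(n)
    ]


def bench(name, dumps, loads, records):
    # Encode every record up front, then time decoding only,
    # since reprocessing is dominated by reads.
    blobs = [dumps(r) for r in records]
    start = time.perf_counter()
    decoded = [loads(b) for b in blobs]
    elapsed = time.perf_counter() - start
    assert decoded == records  # the round trip must be lossless
    return {
        "format": name,
        "bytes": sum(len(b) for b in blobs),
        "decode_seconds": elapsed,
    }


def run(n=10_000):
    records = make_records(n)
    codecs = [
        # One JSON document per record, as in a JSONL file.
        ("jsonl", lambda r: json.dumps(r).encode(), json.loads),
        # Stand-in binary format from the standard library.
        ("pickle", lambda r: pickle.dumps(r, protocol=pickle.HIGHEST_PROTOCOL), pickle.loads),
    ]
    try:
        import msgpack  # optional third-party dependency

        codecs.append(("msgpack", msgpack.packb, msgpack.unpackb))
    except ImportError:
        pass
    return [bench(name, dumps, loads, records) for name, dumps, loads in codecs]


if __name__ == "__main__":
    for row in run():
        print(row)
```

Numbers from synthetic records like these will not match real measurements, which are much larger and more deeply nested, so a real evaluation would need to run the same harness over actual measurement files.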

For historical reference, here are some benchmarks I ran some time ago while building OONI Data: