azavea / osmesa

OSMesa is an OpenStreetMap processing stack based on GeoTrellis and Apache Spark
Apache License 2.0

Move aggregated stats into a JSON column #140

Closed mojodna closed 4 years ago

mojodna commented 5 years ago

Adding a new statistic currently involves modifying the batch aggregator, the streaming aggregator, the database schema, and osmesa-stat-server.

If we move aggregations into a JSON column, we eliminate the need to change the database schema each time a statistic is added (which usually requires doing a backfill). We also gain a path to simplifying osmesa-stat-server so that it is less sensitive to which statistics are available (since it can infer what's available from the content of the JSON column).
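To illustrate the inference idea, here is a minimal sketch in plain Python rather than against the real database. The row layout and the statistic names (`roads_added`, `buildings_added`, `waterways_modified`) are hypothetical and chosen only for the example; they are not taken from the OSMesa schema.

```python
import json
from collections import Counter

# Hypothetical rows as they might come back from a query:
# (changeset id, JSON text holding that changeset's aggregated stats).
rows = [
    (1, '{"roads_added": 3, "buildings_added": 10}'),
    (2, '{"buildings_added": 2, "waterways_modified": 1}'),
]


def available_statistics(rows):
    """Return the sorted names of every statistic present in any row."""
    names = set()
    for _changeset_id, stats_json in rows:
        names.update(json.loads(stats_json).keys())
    return sorted(names)


def total_statistics(rows):
    """Sum each statistic across rows without knowing the names up front."""
    totals = Counter()
    for _changeset_id, stats_json in rows:
        # Counter.update adds the values of matching keys.
        totals.update(json.loads(stats_json))
    return dict(totals)
```

Neither function lists statistic names explicitly, so a statistic that starts appearing in the JSON would show up in the results without a schema change or a change to this code.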

This also provides a path for moving the augmented_diffs columns (which are used for tracking which sequences have been applied to a changeset's aggregate metrics) into the same (or a separate) JSON column, making the backfill requirement more granular (just backfill missing data, not everything again).
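A rough sketch of what tracking applied sequences inside the JSON document could look like, again in plain Python. The document layout (a `stats` object next to an `augmented_diffs` list) and the helper names `apply_diff` and `missing_sequences` are assumptions made for illustration, not the existing implementation.

```python
import json


def apply_diff(aggregate_json, sequence, delta):
    """Add one augmented diff's deltas to the aggregate, at most once."""
    aggregate = json.loads(aggregate_json)
    applied = set(aggregate.get("augmented_diffs", []))
    if sequence in applied:
        # This sequence was already counted; leave the aggregate unchanged.
        return aggregate_json
    stats = aggregate.setdefault("stats", {})
    for name, value in delta.items():
        stats[name] = stats.get(name, 0) + value
    applied.add(sequence)
    aggregate["augmented_diffs"] = sorted(applied)
    return json.dumps(aggregate, sort_keys=True)


def missing_sequences(aggregate_json, expected):
    """Return the expected sequences that have not been applied yet."""
    applied = set(json.loads(aggregate_json).get("augmented_diffs", []))
    return sorted(set(expected) - applied)
```

With this shape, a backfill only needs to replay the sequences reported by `missing_sequences`, and replaying a sequence twice is harmless because `apply_diff` skips anything already recorded.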