Adding a new statistic currently involves modifying the batch aggregator, the streaming aggregator, the database schema, and osmesa-stat-server.
If we move aggregations into a JSON column, we eliminate the need to change the database schema each time a statistic is added (a change that usually requires a backfill). We also gain a path to simplifying osmesa-stat-server so that it is less sensitive to which statistics are available (since it can infer what's available from the content of the JSON column).
This also provides a path for moving the augmented_diffs columns (which track which sequences have been applied to a changeset's aggregate metrics) into the same (or a separate) JSON column, making the backfill requirement more granular (backfill only the missing data, not everything again).
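
As a rough illustration of the idea (not the actual OSMesa schema or code), the sketch below stores per-changeset statistics as JSON, infers the available statistics from the stored keys, and finds the rows that would need a backfill for one statistic. The `changesets` table layout, the `measurements` column, the statistic names, and the helper functions are all assumptions made for the example. SQLite with a text column stands in for the real database so the sketch runs on its own.

```python
import json
import sqlite3

# Hypothetical schema: table, column, and statistic names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE changesets (id INTEGER PRIMARY KEY, measurements TEXT)")
rows = [
    (1, {"road_km_added": 1.5, "buildings_added": 3}),
    (2, {"road_km_added": 0.2}),
    (3, {"buildings_added": 7, "pois_added": 1}),
]
conn.executemany(
    "INSERT INTO changesets (id, measurements) VALUES (?, ?)",
    [(cid, json.dumps(stats)) for cid, stats in rows],
)


def load_measurements(conn):
    """Yield (changeset id, decoded statistics dict) for every row."""
    query = "SELECT id, measurements FROM changesets ORDER BY id"
    for cid, raw in conn.execute(query):
        yield cid, json.loads(raw)


def available_statistics(conn):
    """Infer the statistic names from the keys stored in the JSON column."""
    names = set()
    for _, stats in load_measurements(conn):
        names.update(stats)
    return sorted(names)


def missing_statistic(conn, name):
    """Return ids of changesets that lack `name` and would need a backfill."""
    return [cid for cid, stats in load_measurements(conn) if name not in stats]


print(available_statistics(conn))
print(missing_statistic(conn, "pois_added"))
```

Adding a statistic in this sketch means writing a new key into the JSON. No schema change is needed, and only the rows returned by `missing_statistic` have to be recomputed.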