User and hashtag statistics are provided by the `user_stats` and `hashtag_stats` tables. These are updated continuously using `housekeeping-loop.sh`.

Under some conditions, refreshing the materialized views may take upwards of 2 hours (hence the `2h` timeout), resulting in apparent lag in the API (even when the `raw_changesets` data is up-to-date).

Replacing the materialized views with incrementally-updated rollup tables would improve the latency tremendously while reducing the amount of I/O done by Postgres.
This post summarizes the 2 approaches well: https://www.citusdata.com/blog/2018/10/31/materialized-views-vs-rollup-tables/
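To make the difference concrete, here is a minimal in-memory sketch (plain JavaScript, no database) contrasting a full refresh with an incremental rollup. The field names `user_id` and `num_changes` and both function names are hypothetical, chosen for illustration rather than taken from the actual schema.

```javascript
// Incremental rollup: fold a single changeset into the existing totals.
function applyChangeset(stats, cs) {
  const row = stats.get(cs.user_id) || { changesets: 0, changes: 0 };
  row.changesets += 1;
  row.changes += cs.num_changes;
  stats.set(cs.user_id, row);
  return stats;
}

// Full refresh: rescan every changeset, as a materialized view refresh does.
function fullRefresh(rawChangesets) {
  const stats = new Map();
  for (const cs of rawChangesets) {
    applyChangeset(stats, cs);
  }
  return stats;
}

const raw = [
  { user_id: 1, num_changes: 5 },
  { user_id: 2, num_changes: 3 },
];
const rollup = fullRefresh(raw);

// A new changeset arrives: touch one rollup row instead of rescanning everything.
const incoming = { user_id: 1, num_changes: 2 };
raw.push(incoming);
applyChangeset(rollup, incoming);

console.log(rollup.get(1)); // { changesets: 2, changes: 7 }
```

The work per new changeset stays constant with the rollup, whereas the full refresh grows with the size of `raw_changesets`.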
I see 2 ways to approach this:
Way 1 - manually

Update the rollup tables from `src/stats.js`, to accompany the existing INSERTs/UPDATEs to the `raw_changesets` table.

This keeps the aggregation logic in one place, but may under/over-count (in the rollup table) if errors occur.
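One way to limit that under/over-counting might be to perform the raw write and the rollup update in a single transaction, so that either both are applied or neither is. A rough sketch, assuming a node-postgres style client (an object whose `query(text, values)` returns a promise). The function name `recordChangeset`, the rollup table name `user_stats_rollup`, the column names, and the unique constraints on `raw_changesets.id` and `user_stats_rollup.user_id` are all assumptions for illustration, not the actual contents of `src/stats.js`.

```javascript
// Hypothetical helper: write a changeset and its rollup delta atomically.
// `client` is assumed to expose query(text, values) -> Promise, as node-postgres does.
async function recordChangeset(client, cs) {
  await client.query("BEGIN");
  try {
    // Column names are illustrative; the real raw_changesets schema may differ.
    const inserted = await client.query(
      "INSERT INTO raw_changesets (id, user_id, num_changes) VALUES ($1, $2, $3) " +
        "ON CONFLICT (id) DO NOTHING",
      [cs.id, cs.user_id, cs.num_changes]
    );
    // Only count changesets that were actually new, to avoid over-counting on retries.
    if (inserted.rowCount === 1) {
      await client.query(
        "INSERT INTO user_stats_rollup (user_id, changesets, changes) VALUES ($1, 1, $2) " +
          "ON CONFLICT (user_id) DO UPDATE SET " +
          "changesets = user_stats_rollup.changesets + 1, " +
          "changes = user_stats_rollup.changes + EXCLUDED.changes",
        [cs.user_id, cs.num_changes]
      );
    }
    await client.query("COMMIT");
  } catch (err) {
    // Undo both writes so the rollup does not drift from raw_changesets.
    await client.query("ROLLBACK");
    throw err;
  }
}
```

This only covers newly inserted changesets. Updates to existing rows would need to apply a delta (new value minus old value) rather than a plain increment, which is omitted here.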
Way 2 - triggers
raw_changesets