alexsherman / leeg

2 stars 0 forks source link

partition all_matches based on time or patch #48

Closed alexsherman closed 5 years ago

alexsherman commented 5 years ago

We should partition the all_matches table into logical partitions by time or patch version. The reason is that we can drop old partitions when they are no longer useful.

For our live req service, people want reqs that work for them now - the last patch, last week, last month, season, etc. So if we partition smartly, when a patch's data is deemed old enough not to matter for live recommendations (1 patch? 3 patch sliding window? we can decide), we can download that partition and store it somewhere for safekeeping, calculate and store some stats about that patch and keep that in a live table (for, say, graphs of champ performance over time, which doesn't require all matches actually in postgres), and then drop that partition. that way, our main database can have a somewhat consistent total size over time (once we reach that size - let's ballpark like 3 million matches per patch partition, maybe 10 live partitions ~~ something like 30gb of matches. we ain't even big data)

https://www.postgresql.org/docs/11/ddl-partitioning.html https://dba.stackexchange.com/questions/106014/how-to-partition-existing-table-in-postgres