:+1:
Do we keep the data? If so, how do we archive it?
Also, the majority of these rows are Stanford data, since we check those much more frequently.
To archive, you can write the rows out to a .csv file and delete them, or migrate them to a statuses_history table if you still want them in the database. Either way, it'll keep the statuses indexes small and should speed up queries; there are currently around 200B sequential tuple reads on that table. It's not urgent, though, as the table is still pretty small (3GB).
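Roughly what the history-table option could look like, assuming Postgres and a plain Rails migration; the 90-day cutoff and the column handling are placeholders, not a real proposal:

```ruby
class ArchiveOldStatuses < ActiveRecord::Migration
  def up
    execute <<-SQL
      -- Clone the statuses schema (columns only, no indexes)
      CREATE TABLE IF NOT EXISTS statuses_history (LIKE statuses);
      -- Move old rows in one atomic statement via a writable CTE
      WITH moved AS (
        DELETE FROM statuses
        WHERE created_at < now() - interval '90 days'  -- placeholder cutoff
        RETURNING *
      )
      INSERT INTO statuses_history SELECT * FROM moved;
    SQL
  end
end
```

The DELETE ... RETURNING / INSERT pair means rows are never visible in both tables at once, and since statuses_history carries no indexes, the insert side stays cheap.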
Do you think table partitioning might be another acceptable approach here?
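What I have in mind is something like Postgres declarative range partitioning (assuming a Postgres version that supports it; the column list and partition bounds below are purely illustrative):

```ruby
class PartitionStatuses < ActiveRecord::Migration
  def up
    execute <<-SQL
      -- Illustrative parent table, partitioned by row age;
      -- columns here stand in for the real statuses schema
      CREATE TABLE statuses_partitioned (
        id         bigserial,
        layer_id   bigint,
        created_at timestamp NOT NULL
      ) PARTITION BY RANGE (created_at);

      -- One partition per year; expired years can be dropped wholesale
      CREATE TABLE statuses_2015 PARTITION OF statuses_partitioned
        FOR VALUES FROM ('2015-01-01') TO ('2016-01-01');
    SQL
  end
end
```

Expiring old data would then be a cheap DROP TABLE on the oldest partition rather than a bulk DELETE.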
The statuses table in production has over 1,000 layers, each with 1,000 or more status rows (4M rows total). Is there a reasonable limit on the number of status rows we keep for any given layer? If so, a rake task that deletes (or archives) the older rows from the statuses table would be helpful.
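A sketch of what that rake task could be, assuming a Status model backed by layer_id and created_at columns; the default cap of 1,000 rows per layer just mirrors the numbers above:

```ruby
namespace :statuses do
  desc 'Delete all but the newest N status rows per layer'
  task :prune, [:keep] => :environment do |_t, args|
    keep = (args[:keep] || 1000).to_i
    # Rank each layer's rows by recency, then delete everything past the cap
    Status.connection.execute(<<-SQL)
      DELETE FROM statuses
      WHERE id IN (
        SELECT id FROM (
          SELECT id, row_number() OVER (
            PARTITION BY layer_id ORDER BY created_at DESC
          ) AS rn
          FROM statuses
        ) ranked
        WHERE ranked.rn > #{keep}
      );
    SQL
  end
end
```

Run as e.g. `rake statuses:prune[500]` to keep only the newest 500 rows per layer; swapping the DELETE for an insert into statuses_history would give the archiving variant.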