geoblacklight / geomonitor

DEPRECATED: See https://github.com/geoblacklight/geo_monitor

rake task to clean statuses table #41

Closed drh-stanford closed 9 years ago

drh-stanford commented 9 years ago

The statuses table in production has over 1,000 layers with 1,000 or more status rows each (4M rows total). Is there a reasonable limit on the number of status rows we keep for any given layer? If so, a rake task that deletes (or archives) the older rows from the statuses table would be helpful.
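
As a rough sketch of the "keep only the newest N rows per layer" idea, the selection logic could look something like the following. The `StatusRow` struct, its fields, and the `status_ids_to_prune` helper are hypothetical names for illustration and are not taken from the geomonitor schema.

```ruby
# Hypothetical row shape; the real statuses table may use different columns.
StatusRow = Struct.new(:id, :layer_id, :created_at)

# Returns the ids of status rows that fall outside the newest `keep`
# rows for each layer. Nothing is deleted here; the caller decides
# whether to delete or archive the returned ids.
def status_ids_to_prune(rows, keep:)
  rows.group_by(&:layer_id).flat_map do |_layer_id, layer_rows|
    # Sort newest first, so everything after the first `keep` rows is surplus.
    layer_rows.sort_by(&:created_at).reverse.drop(keep).map(&:id)
  end
end
```

In a rake task, the returned ids could then be deleted in batches. At 4M rows it would probably be better to push the same per-layer selection into SQL rather than load every row into memory.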

mejackreed commented 9 years ago

:+1:

Do we keep the data? If so, how do we archive it?

mejackreed commented 9 years ago

Also, most of these rows are Stanford data, since we check those layers much more frequently.

drh-stanford commented 9 years ago

To archive, you can write the rows to a .csv file and then delete them, or migrate them to a statuses_history table if you still want them in the database. Either way, it'll keep the statuses indexes small and should speed up queries. There are something like 200B sequential tuple reads on that table currently. It's not urgent, though, as the table is still pretty small (3GB).
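
A minimal sketch of the CSV option, assuming rows arrive as hashes: write them out first and return the archived ids, so that only rows that actually made it into the file get deleted afterwards. The `archive_statuses` helper and the column names are assumptions for illustration.

```ruby
require "csv"
require "stringio"

# Hypothetical helper: writes each row to `io` as CSV (with a header line)
# and returns the ids that were written. Column names are assumed.
def archive_statuses(rows, io, columns: %i[id layer_id created_at])
  csv = CSV.new(io)
  csv << columns
  rows.map do |row|
    csv << columns.map { |col| row.fetch(col) }
    row.fetch(:id)
  end
end
```

Passing a `File` opened for writing instead of a `StringIO` would produce the archive file. The delete step would then be limited to the returned ids.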

mejackreed commented 9 years ago

Do you think table partitioning might be another acceptable approach here?