The map was changed a while back to only show dots for areas that have been updated in the past year.
The underlying datamap_* tables still contain rows older than that. The datamaps script pulls out all rows, regardless of their modified date, and writes them to disk, as there is no database index on the modified column.
To make this more efficient, we should:
add an index on the modified column of the datamap_* tables
change the SQL query in scripts/datamaps.py to filter out any rows older than a year
add a new recurring job to delete rows from the datamap_* tables if they weren't modified in the last year
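
A rough sketch of what the three steps above could look like. It uses an in-memory SQLite database as a stand-in for the real database, so the exact SQL syntax may need adjusting. The table names in `TABLES`, the `grid` column, the index name, and all function names are assumptions made for illustration and are not taken from scripts/datamaps.py.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical shard names; the real datamap_* table names may differ.
TABLES = ("datamap_a", "datamap_b")


def cutoff(today, days=365):
    """Return the oldest modified date that still counts as recent."""
    return today - timedelta(days=days)


def add_modified_index(conn, table):
    # Step 1: index the modified column so a date filter can avoid a full scan.
    conn.execute(
        f"CREATE INDEX IF NOT EXISTS idx_{table}_modified ON {table} (modified)"
    )


def export_recent(conn, table, today):
    # Step 2: only pull rows modified within the last year.
    # Dates are stored as ISO strings here, so string comparison orders them.
    rows = conn.execute(
        f"SELECT grid FROM {table} WHERE modified >= ? ORDER BY grid",
        (cutoff(today).isoformat(),),
    )
    return [row[0] for row in rows]


def delete_stale(conn, table, today):
    # Step 3: body of a recurring clean-up job.
    # Returns the number of rows that were removed.
    cur = conn.execute(
        f"DELETE FROM {table} WHERE modified < ?",
        (cutoff(today).isoformat(),),
    )
    conn.commit()
    return cur.rowcount
```

The recurring job would then call `delete_stale` once per table in `TABLES`, and the export would call `export_recent` instead of selecting every row. Computing the cutoff in Python rather than in SQL keeps the query portable across databases.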
I did a manual clean-up of the tables to remove any rows with modified < '2015-11-01' and optimized the tables afterwards, so there isn't a big one-time backlog to process.