Closed: hancush closed this issue 3 years ago
`dbinit` calls `loaddivisions`, which comes from python-opencivicdata. If you look at the guts of that management command, much of the work is ultimately done by this class: https://github.com/opencivicdata/python-opencivicdata/blob/master/opencivicdata/divisions.py#L16

Digging deeper, you can set an environment variable pointing to a filepath from which to load the divisions, instead of loading all known Open Civic Data divisions from GitHub: https://github.com/opencivicdata/python-opencivicdata/blob/master/opencivicdata/divisions.py#L24

We could use that to create a custom CSV containing just the divisions we need for LA Metro.
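The override pattern behind the linked line can be sketched like this (a rough illustration, not the library's actual code; the environment variable name `OCD_DIVISION_CSV` and the function names here are assumptions):

```python
import csv
import os

# Illustrative sketch: prefer a small local CSV named by an environment
# variable, and only fall back to fetching every known division.
# "OCD_DIVISION_CSV" and both function names are assumed, not the
# library's real identifiers.
def load_divisions(fetch_all_divisions):
    path = os.environ.get("OCD_DIVISION_CSV")
    if path:
        with open(path, newline="") as f:
            # Division CSVs carry an "id" column of OCD division ids.
            return [row["id"] for row in csv.DictReader(f)]
    return fetch_all_divisions()
```

With a CSV of just the handful of LA Metro divisions, `dbinit` would then load a few rows instead of the full division set.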
Thanks for the tip, @fgregg! python-opencivicdata will only skip loading if the incoming divisions exactly match the database contents, so I submitted a PR to also skip loading when the incoming set is a subset of the database contents: https://github.com/opencivicdata/python-opencivicdata/pull/139

Care to have a look?
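The gist of the proposed change (paraphrased here, not the PR's literal diff) is relaxing an equality test into a subset test:

```python
# Paraphrase of the skip condition: before the change, loading was
# skipped only when the incoming division ids exactly equalled what
# the database held; the proposal is to skip whenever the incoming
# ids are a subset of the database contents.
def should_skip_load(incoming_ids, existing_ids):
    return set(incoming_ids) <= set(existing_ids)
```

That way, a database seeded with the full division list doesn't get reloaded just because a scraper supplies a smaller custom CSV.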
Our scrape container runs `pupa dbinit` as its entry point. This command runs migrations, then loads divisions. Loading divisions is actually a fairly heavy operation. It looks like there is a bulk option, which could reduce the overhead, but it won't work for databases that already contain data, because it deletes all divisions before loading, and jurisdictions (and probably other models as well) reference divisions as protected foreign keys.
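Why the bulk path fails on a populated database can be illustrated with a plain SQLite analogue of a protected foreign key (`ON DELETE RESTRICT` standing in for Django's `on_delete=PROTECT`; the table shapes are made up for illustration):

```python
import sqlite3

# Toy schema standing in for the real models: a jurisdiction row holds
# a restricted (in Django terms, PROTECT-ed) foreign key to a division.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE division (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE jurisdiction (id TEXT PRIMARY KEY, "
    "division_id TEXT REFERENCES division (id) ON DELETE RESTRICT)"
)
conn.execute("INSERT INTO division VALUES ('ocd-division/country:us')")
conn.execute("INSERT INTO jurisdiction VALUES ('j1', 'ocd-division/country:us')")

# The bulk loader's first step is effectively "delete every division",
# which the referencing jurisdiction row blocks.
try:
    conn.execute("DELETE FROM division")
    blocked = False
except sqlite3.IntegrityError:
    blocked = True
```

On an empty database nothing references divisions yet, so delete-and-reload succeeds; that is why the bulk option only helps before any data exists.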
We either need to:

- stop running `pupa dbinit` every time the scrapers run, or
- make `loaddivisions` less expensive.

Need to put this down for now. Any thoughts, @fgregg?
Example log (as these are cleaned up from the server at regular intervals):