Closed fgregg closed 5 years ago
@hancush this is ready for me to indoctrinate you.
logging to return to later: some memberships don't have end dates in the ocd api, e.g., rahm https://ocd.datamade.us/ocd-person/f649753d-081d-4f22-8dcf-3af71de0e6ca/
in django-councilmatic, we store null values as null, but ocd stores null values as empty strings.
this causes things to break when we perform operations that assume a certain type, e.g.,
this may come to bear on other fields, too.
logging for myself: chicago doesn't have bill documents!
select * from opencivicdata_billdocument as doc join opencivicdata_bill as bill on doc.bill_id = bill.id join opencivicdata_organization as org on bill.from_organization_id = org.id where org.jurisdiction_id = 'ocd-jurisdiction/country:us/state:il/place:chicago/government';
id | note | date | bill_id | created_at | updated_at | extras | id | identifier | title | classification | subject | legislative_session_id | from_organization_id | locked_fields | created_at | updated_at | extras | id | name | image | classification | founding_date | dissolution_date | jurisdiction_id | parent_id | locked_fields
----+------+------+---------+------------+------------+--------+----+------------+-------+----------------+---------+------------------------+----------------------+---------------+------------+------------+--------+----+------+-------+----------------+---------------+------------------+-----------------+-----------+---------------
(0 rows)
@fgregg making good progress here.
can you say more about:
?
re: add order_by to meta class on EventAgendaItem
, there's already clean_agenda_items
which orders and dedupes agenda items.
we already use this property to display agenda items. should be able to continue using with minor changes, e.g., i think it's event.agenda
to get the agenda items (not event.agenda_items
).
if you add ordering to the model class, it will, by default return querysets in that order. This is just an opportunity to clean up some code, and not required for this migration.
https://docs.djangoproject.com/en/2.2/ref/models/options/#ordering
tl;dr - Remove the need for extensive and error-prone ETL by basing the Councilmatic app off Open Civic Data models, i.e., a jurisdiction-specific database fed by the scrapers, rather than maintaining a remote database containing a set of similar, Councilmatic-specific models.
There is a Python package containing Django versions of the OCD models. The relevant ones live in opencivicdata.core.models
and opencivicdata.legislative.models
.
Further reading:
Leverage multi-table inheritance to add additional fields to OCD models. The primary use case in the refactor is adding slugs to first-class models – Person, Event, Bill, and Organization - and adding a headshot to Person.
Since the scrapers create OCD objects, we added signals to each of the subclassed models to create the related Councilmatic model on save. Read more on Django signals. »
We have also added a management command to import headshot files.
We make extensive use of proxy models to add custom managers and additional properties and methods to classes for use in Councilmatic code. This is an issue because the proxied models return OCD objects, not Councilmatic objects. To get around that, we used the hacky but effective django-proxy-overrides
library to force models to return related objects as Councilmatic models, as needed.
To customize the model of related objects, proxy a model like opencivicdata.legislative.models.BillAction
-> councilmatic_core.models.BillAction
, and override one or more related object attributes with a ProxyForeignKey
.
A brief summary of alternative approaches and why we ultimately did not choose to use them can be found here.
import_data
command
This PR changes django-councilmatic to use opencivicdata models.
In concert with https://github.com/datamade/chi-councilmatic/pull/252, these code changes can largely reproduce the Chicago Councilmatic site.
TODO: