dougestey / toronto-city-hall-api

Powering municipal apps in Toronto.

Map scraped data to models #5

Open dougestey opened 9 years ago

dougestey commented 9 years ago

Since we've decided to follow the specification established by the Popolo project, a move that will future-proof the API for use in other places, we now need to wire up the scraper DB to the appropriate models.

People

TODO: indicate which of these aren't being scraped properly at this time.

dougestey commented 9 years ago

I'm removing this from our list of priorities right now, because after going through the resulting data from ca_on_toronto I'm not happy with it yet. It has a lot of holes and yet-to-be-related models. I'll keep it open, because this is still a path I'd like to go down - but we do need to see updates to the scraper before that can happen.

We can still use it for motions and votes, however.

patcon commented 9 years ago

Sounds good. Specifics on what about it you're not happy with though?

dougestey commented 9 years ago

A few examples:

Bills (albeit without abstracts), voteevents and votecounts are being captured & related though, and these will be important for us from the get go.

jpmckinney commented 9 years ago

@dougestey Just saw this issue.

> personcontactdetail not captured at all

The scraper stores the contact details on the Membership, not on the Person (because a councillor's email, address, etc. relate to their position on council, not to their personal identity). Unfortunately, the Imago API doesn't seem to expose contact details for Memberships, so the data is captured but not exposed. I wrote a pull request (https://github.com/opencivicdata/imago/pull/61), and I can switch to my fork if this is a pressing issue.
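As a rough illustration of the model described above, here is a minimal Python sketch in which contact details hang off the Membership rather than the Person. The class and field names loosely follow Popolo terminology, but the helper `contact_details_for` and all sample values are hypothetical; this is not the scraper's or Imago's actual code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContactDetail:
    type: str   # e.g. "email", "voice", "address"
    value: str


@dataclass
class Person:
    id: str
    name: str


@dataclass
class Membership:
    person_id: str
    organization_id: str
    role: str
    # Contact details belong to the position, not to the person.
    contact_details: List[ContactDetail] = field(default_factory=list)


def contact_details_for(person: Person, memberships: List[Membership]) -> List[ContactDetail]:
    """Collect contact details from every membership the person holds."""
    return [
        detail
        for membership in memberships
        if membership.person_id == person.id
        for detail in membership.contact_details
    ]


# Hypothetical sample data, for illustration only.
jane = Person(id="person-1", name="Jane Doe")
memberships = [
    Membership(
        person_id="person-1",
        organization_id="council",
        role="Councillor",
        contact_details=[ContactDetail("email", "jane.doe@example.org")],
    ),
    Membership(person_id="person-2", organization_id="council", role="Councillor"),
]
jane_details = contact_details_for(jane, memberships)
```

A consumer that wants a person's email therefore has to go through their memberships, which is why an API that omits contact details on Memberships appears to have none at all.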

> personvote not relating to person, captured instead as a voter_name string

This is an upstream issue (https://github.com/opencivicdata/pupa/issues/147). Please chime in there so that it gets higher priority. I can bring it up with the Sunlight team if there's a pressing need.
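Until that is resolved upstream, an API consumer could in principle relate votes to people itself by matching the `voter_name` string against known names. The sketch below is a hypothetical client-side workaround, not anything pupa does; the dictionary keys other than `voter_name`, and the function names, are assumptions made for illustration.

```python
def normalize(name):
    """Lowercase a name and collapse runs of whitespace for comparison."""
    return " ".join(name.casefold().split())


def link_votes_to_people(votes, people):
    """Attach a voter_id to each vote whose voter_name matches exactly one person.

    Votes with no match, or with an ambiguous match, get voter_id=None.
    """
    ids_by_name = {}
    for person in people:
        ids_by_name.setdefault(normalize(person["name"]), []).append(person["id"])

    linked = []
    for vote in votes:
        matches = ids_by_name.get(normalize(vote["voter_name"]), [])
        voter_id = matches[0] if len(matches) == 1 else None
        linked.append({**vote, "voter_id": voter_id})
    return linked


# Hypothetical sample data, for illustration only.
people = [
    {"id": "person-1", "name": "Jane Doe"},
    {"id": "person-2", "name": "John Roe"},
]
votes = [
    {"voter_name": "jane  doe", "option": "yes"},
    {"voter_name": "Unknown Voter", "option": "no"},
]
linked_votes = link_votes_to_people(votes, people)
```

Name matching of this kind is fragile (spelling variants, duplicate names), which is one reason a proper relation in the scraped data is preferable.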

> post (ward posting) not relating to person

A Person holds a Post via a Membership, so Post should never be directly related to Person. There may nonetheless be a related issue to solve, if you can describe the problem more specifically.
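To make the indirection concrete, here is a small Python sketch of looking up a person's posts by going through memberships. The keys `person_id` and `post_id` follow Popolo's Membership fields as far as I know; the function name, the plain-dict representation, and the sample records are hypothetical.

```python
def posts_held_by(person_id, memberships, posts):
    """Return the posts a person holds, resolved through their memberships."""
    posts_by_id = {post["id"]: post for post in posts}
    return [
        posts_by_id[membership["post_id"]]
        for membership in memberships
        if membership["person_id"] == person_id
        and membership.get("post_id") in posts_by_id
    ]


# Hypothetical sample data, for illustration only.
posts = [
    {"id": "post-1", "label": "Councillor, Ward 1"},
    {"id": "post-2", "label": "Councillor, Ward 2"},
]
memberships = [
    {"person_id": "person-1", "organization_id": "council", "post_id": "post-1"},
    {"person_id": "person-2", "organization_id": "council", "post_id": "post-2"},
    # A membership with no post (e.g. a committee seat) is simply skipped.
    {"person_id": "person-1", "organization_id": "committee"},
]
held = posts_held_by("person-1", memberships, posts)
```

With this shape, a ward post that looks unrelated to any person usually means the linking Membership is missing or lacks a `post_id`, rather than that a direct Person-to-Post relation is needed.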

> Bills (albeit without abstracts)

For bill abstracts, I'm not sure where that data is (I wrote the scraper a while ago), but if it's scrapable, the scraper can be improved to capture that data.