palewire / django-calaccess-campaign-browser

A Django app to refine, review and republish campaign finance data drawn from the California Secretary of State’s CAL-ACCESS database
http://django-calaccess-campaign-browser.californiacivicdata.org
MIT License

Reconcile candidate filers with their Sunlight Foundation OpenStates unique ID #187

Open palewire opened 9 years ago

palewire commented 9 years ago

IDs downloadable here. Linking the two would allow us to connect with outside databases.

JoeGermuska commented 9 years ago

I'm interested in this one. I have questions related to what I raised in california-civic-data-coalition/django-calaccess-campaign-browser#173 about the need for human review to either ratify or audit any automated linking. The names I saw in campaign data are pretty messy ("Tax Freeze" Craig Freis)...

palewire commented 9 years ago

I think there would almost certainly have to be human review. The best list we have in the database is the one we scrape from their elections page, which winnows down the huge pool of filers to those who have participated in real elections.

My hope is that for past contests this would be one-time work that we could save as a fixture or crosswalk or whatever. Crazy?
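A rough sketch of how that one-time pass could be staged: fuzzy-match the scraped candidate names against a legislator list, then dump the suggestions to a CSV crosswalk that a person reviews before anything is saved. Everything named here (`normalize`, `propose_matches`, `write_crosswalk`, the column names, the 0.8 cutoff) is hypothetical and not part of the app; it uses only the standard library's `difflib` and `csv`.

```python
import csv
import difflib
import re


def normalize(name):
    """Drop quoted nicknames and punctuation, lowercase, collapse whitespace."""
    name = re.sub(r'"[^"]*"', " ", name)  # e.g. strips "Tax Freeze"
    name = re.sub(r"[^a-z\s]", " ", name.lower())
    return " ".join(name.split())


def propose_matches(filer_names, legislators, cutoff=0.8):
    """Yield (filer_name, openstates_id, openstates_name, score) suggestions.

    `legislators` is assumed to map an OpenStates ID to a display name.
    These are only suggestions; every row still needs human review.
    """
    # Assumption: normalized legislator names are unique. If two collide,
    # the later one wins, which a real crosswalk would need to handle.
    by_name = {normalize(n): os_id for os_id, n in legislators.items()}
    for filer in filer_names:
        key = normalize(filer)
        close = difflib.get_close_matches(key, list(by_name), n=3, cutoff=cutoff)
        for candidate in close:
            score = difflib.SequenceMatcher(None, key, candidate).ratio()
            os_id = by_name[candidate]
            yield filer, os_id, legislators[os_id], round(score, 2)


def write_crosswalk(rows, handle):
    """Write suggestions as CSV with a `reviewed` column defaulting to 'no'."""
    writer = csv.writer(handle)
    writer.writerow(
        ["filer_name", "openstates_id", "openstates_name", "score", "reviewed"]
    )
    for row in rows:
        writer.writerow(list(row) + ["no"])
```

A reviewer would flip `reviewed` to `yes` (or delete the row) for each suggestion, and only the confirmed rows would be kept as the fixture. Names stored as "LAST, FIRST" would need extra handling that this sketch does not attempt.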

JoeGermuska commented 9 years ago

Not totally crazy. I'm going to be talking with some Chicago folks about overlapping issues. Will keep you posted.

rkiddy commented 9 years ago

The OpenStates people only have IDs for the elected members of the state legislature. Am I right about that? That is not a large fraction of the names that appear in the CAL-ACCESS data. Of course, it absolutely is worth doing.

I think that, eventually, there will be a need for a calaccess_campaign_browser_person table. Candidates are important, but treasurers, filers, signers (i.e. sig_naml), partners (i.e. prn_naml), recipients of money (i.e. rcpt_naml), payees, "entities" (i.e. enty_naml) and all of the other names in the database will have to be disambiguated somehow, so that the different forms of any name point to the "person" row, the actual identity for that person.

That "person" table should definitely have a way to record the OpenStates ID, the FEC ID and whatever other IDs the person is connected to.
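To make the idea concrete, here is a minimal sketch of one possible shape for it: a `person` row, an alias table holding every raw spelling along with the field it came from, and a table of external IDs keyed by scheme. It uses plain `sqlite3` so it runs on its own; in the app this would presumably be Django models instead. All table, column and function names below are assumptions for illustration, not an existing schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE person (
    id INTEGER PRIMARY KEY,
    display_name TEXT NOT NULL
);
CREATE TABLE person_alias (
    id INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES person(id),
    source_field TEXT NOT NULL,  -- e.g. 'sig_naml', 'rcpt_naml', 'enty_naml'
    raw_name TEXT NOT NULL
);
CREATE TABLE person_external_id (
    id INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES person(id),
    scheme TEXT NOT NULL,        -- e.g. 'openstates', 'fec'
    value TEXT NOT NULL,
    UNIQUE (scheme, value)
);
"""


def open_db(path=":memory:"):
    """Open a database and create the hypothetical tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_person(conn, display_name):
    cur = conn.execute(
        "INSERT INTO person (display_name) VALUES (?)", (display_name,)
    )
    return cur.lastrowid


def add_alias(conn, person_id, source_field, raw_name):
    conn.execute(
        "INSERT INTO person_alias (person_id, source_field, raw_name) "
        "VALUES (?, ?, ?)",
        (person_id, source_field, raw_name),
    )


def add_external_id(conn, person_id, scheme, value):
    conn.execute(
        "INSERT INTO person_external_id (person_id, scheme, value) "
        "VALUES (?, ?, ?)",
        (person_id, scheme, value),
    )


def find_people(conn, raw_name):
    """Return (id, display_name) for every person linked to this spelling."""
    return conn.execute(
        "SELECT DISTINCT p.id, p.display_name FROM person p "
        "JOIN person_alias a ON a.person_id = p.id "
        "WHERE a.raw_name = ? ORDER BY p.id",
        (raw_name,),
    ).fetchall()


def external_ids(conn, person_id):
    """Return {scheme: value} for one person."""
    rows = conn.execute(
        "SELECT scheme, value FROM person_external_id WHERE person_id = ?",
        (person_id,),
    ).fetchall()
    return dict(rows)
```

The alias table deliberately allows the same spelling to point at more than one person, since two different people can share a name; deciding which one a given filing means is still a review step.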

aboutaaron commented 9 years ago

+1 to creating a person table. Currently, we're not really tracking folks like treasurers, agents, etc. and I'd love to have a way to follow and link those folks together.