Closed AlJohri closed 9 years ago
Hey, sorry for being absent on stuff lately. I just wanted to drop in and say I think this is an awesome idea, and I would love a PR. I'm interested in hearing more about special title resolution; this is something I took a crack at almost 2 years ago now, and though I actually had success with it, that branch of code is now collecting dust on my hard drive. It's been my impression that on days when someone serves as speaker pro tem or similar they're usually not saying anything in their own capacity as a legislator, but if you think it will yield good stuff I'm all for it.
I'd like to resolve metadata about a speaker from https://github.com/unitedstates/congress-legislators and place it within the parsed XML CrDoc.
This would be similar to `db_bioguide_lookup` from the CapitolWords Solr ingestor (https://github.com/sunlightlabs/Capitol-Words/blob/master/solr/lib.py#L151); however, it would simply check the YAML file instead of the bioguide and NYT APIs.
Similar to the CapitolWords method `get_speaker_metadata` (https://github.com/sunlightlabs/Capitol-Words/blob/master/solr/ingest.py#L218), it would strip the "Mr/Ms/Mrs" at the beginning of the speaker's title and find a legislator with the same last name and a term covering the year of the speech.
Lastly, I also wanted to use the `congress-legislators` repository to resolve speakers whose name resolves to a "special title" such as "speaker pro tempore", "vice president", "president", "recorder", etc. If the title represents a person, it would resolve to the correct legislator for the given year; if the title represents something like recorder, it would set a field to indicate that.
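To make the last-name-plus-year matching concrete, here is a rough sketch of what the lookup could look like. It is only an illustration, not code from either project: the function names (`strip_title`, `served_in_year`, `resolve_legislator`) are hypothetical, the records are made-up placeholders shaped like entries in the `congress-legislators` YAML files (`id`, `name`, `terms` with ISO `start`/`end` dates), and in practice the list would presumably be loaded from the YAML rather than hard-coded.

```python
import re
from datetime import date

# Made-up placeholder records, shaped like entries in the congress-legislators
# YAML files (assumed keys: id.bioguide, name.last, terms[].start/end).
LEGISLATORS = [
    {"id": {"bioguide": "X000001"},
     "name": {"first": "Jane", "last": "Example"},
     "terms": [{"type": "rep", "start": "2011-01-05", "end": "2013-01-03"}]},
    {"id": {"bioguide": "X000002"},
     "name": {"first": "John", "last": "Example"},
     "terms": [{"type": "sen", "start": "1995-01-04", "end": "2001-01-03"}]},
]

# Leading honorific such as "Mr.", "Ms" or "Mrs." followed by whitespace.
TITLE_RE = re.compile(r"^(mr|ms|mrs)\.?\s+", re.IGNORECASE)


def strip_title(speaker):
    """Drop a leading Mr/Ms/Mrs from a speaker string like 'Mr. SMITH'."""
    return TITLE_RE.sub("", speaker.strip())


def served_in_year(term, year):
    """True if the term's start/end years bracket the given year."""
    start = date.fromisoformat(term["start"]).year
    end = date.fromisoformat(term["end"]).year
    return start <= year <= end


def resolve_legislator(speaker, year, legislators=LEGISLATORS):
    """Return the single legislator whose last name and term match, else None."""
    last = strip_title(speaker).lower()
    matches = [
        leg for leg in legislators
        if leg["name"]["last"].lower() == last
        and any(served_in_year(t, year) for t in leg["terms"])
    ]
    # Ambiguous or missing matches are left unresolved rather than guessed.
    return matches[0] if len(matches) == 1 else None
```

This sketch deliberately returns nothing when two legislators with the same last name served in the same year; disambiguating those (for example by chamber or by a trailing "of Texas") would need extra handling that is not shown here.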
I was thinking of just adding an option such as `--resolve-legislators` to perform this parsing.
Would you be interested in such a PR?
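For what it's worth, an opt-in flag like this could be wired up roughly as follows. This is a sketch assuming the command-line entry point uses `argparse`; the `build_parser` function and the help text are hypothetical and do not reflect the parser's actual CLI.

```python
import argparse


def build_parser():
    """Hypothetical CLI parser with an opt-in legislator resolution flag."""
    parser = argparse.ArgumentParser(
        description="Parse Congressional Record files (illustrative only)."
    )
    # Off by default, so existing behavior is unchanged unless requested.
    parser.add_argument(
        "--resolve-legislators",
        action="store_true",
        help="look up speaker metadata in the congress-legislators data",
    )
    return parser
```

With `action="store_true"`, `build_parser().parse_args(["--resolve-legislators"]).resolve_legislators` is `True`, and it is `False` when the flag is omitted.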
CC: @drinks