speaker_raw capitalization

haincha commented 11 years ago

Nothing too crazy, just something I noticed. Pulling the ['speaker_raw'] field gives you a lowercase phrase, e.g. "mr. such andso". Is there any way this can be corrected, or would it have to be fixed manually in each article in the database?

drinks commented 11 years ago

@haincha, first and foremost, sorry! For some reason I didn't get an email about this issue. I wouldn't recommend using the speaker_raw field for any practical display use--it's only intended as a paper trail of the actual text encountered in the record, so it's more useful for debugging than for anything else. In most cases the original text is all uppercase, so there isn't a good naïve source of data for correctly casing names to begin with, but you're correct in observing that our Solr index stores it case-insensitively. I'd point you instead toward a combination of speaker_first and speaker_last, or even resolving speaker_bioguide against something like https://github.com/unitedstates/congress-legislators to get names.
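A rough sketch of that fallback order in Python, in case it helps. The field names speaker_first, speaker_last, speaker_bioguide and speaker_raw are the ones discussed above. Everything else here is an assumption for illustration: the helper names build_bioguide_index and display_name are made up, the sample record and legislator are invented, and the legislator entries are assumed to be already-parsed dicts with an id.bioguide key and a name mapping (first, last, official_full), which is how the congress-legislators data is laid out as far as I know.

```python
def build_bioguide_index(legislators):
    """Map bioguide ID -> legislator entry (assumed shape: {"id": {...}, "name": {...}})."""
    index = {}
    for person in legislators:
        bioguide = person.get("id", {}).get("bioguide")
        if bioguide:
            index[bioguide] = person
    return index


def display_name(record, index=None):
    """Pick a display name for a record, using speaker_raw only as a last resort."""
    # 1. Prefer a lookup by bioguide ID when an index is available.
    if index:
        person = index.get(record.get("speaker_bioguide"))
        if person:
            name = person.get("name", {})
            full = name.get("official_full") or " ".join(
                part for part in (name.get("first"), name.get("last")) if part
            )
            if full:
                return full

    # 2. Otherwise combine the first and last name fields from the record.
    joined = " ".join(
        part for part in (record.get("speaker_first"), record.get("speaker_last")) if part
    )

    # 3. Fall back to the raw (lowercased) text if nothing better exists.
    return joined or record.get("speaker_raw", "")


# Invented sample data, not taken from the real index.
record = {
    "speaker_raw": "ms. doe",
    "speaker_first": "Jane",
    "speaker_last": "Doe",
    "speaker_bioguide": "D000000",
}
legislators = [
    {"id": {"bioguide": "D000000"},
     "name": {"first": "Jane", "last": "Doe", "official_full": "Jane Q. Doe"}},
]

print(display_name(record))
print(display_name(record, build_bioguide_index(legislators)))
```

The index is built once and passed in, so the lookup stays a plain dict access per record and the function still works when no legislator data has been loaded.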

haincha commented 11 years ago

Oh, it is quite okay. It was just something I had encountered while working through the Codecademy API course. I appreciate the email back.

Chase

propublica / Capitol-Words

speaker_raw capitalization #59