sbma44 closed this issue 13 years ago
I have asked Eric for his take on why there might be two Blanche Lincolns at that first link. Given the duplicates, I'm guessing that this is either an API error or that we just need to filter the records that are coming back in some way.
All congressperson pictures are identified by bioguide ID. On the first link, the second "Blanche Lincoln" has a bioguide ID of L000555, which, as it turns out, is a valid bioguide ID for her, but one that redirects to an alternate spelling of her name: http://bioguide.congress.gov/scripts/biodisplay.pl?index=L000555
On the second page, Rod Grams has been out of office since 2001, and our Legislator API doesn't go back that far (it goes back to, I believe, the 110th Congress, which started in January 2007): http://bioguide.congress.gov/scripts/biodisplay.pl?index=G000367
Same for Gilbert Gutknecht; he's just too old: http://bioguide.congress.gov/scripts/biodisplay.pl?index=G000536
On the third page, Christensen is the victim of another separate-ID-for-alternate-spelling: http://bioguide.congress.gov/scripts/biodisplay.pl?index=C001039
How do we end up with these alternate bioguide IDs? They don't appear in our Sunlight API, so I guess Capitol Words does name->bioguide resolution using an alternate source?
For the issue with older Congresspersons, this is going to be widespread if we're going back far enough. Here are two issues missing Grams, and also Tom Daschle: http://capitolwords.org/date/2000/12/14/S11774-2-serving-in-the-senate http://capitolwords.org/date/1998/09/17/S10501-2-sense-of-the-senate-regarding-puerto-rico
I think our choices there are either to do a very comprehensive retroactive update of our photo database, or find a cute "this user hasn't uploaded a profile picture!" image to use where we don't have one.
Thanks for looking at this so quickly! Yeah, Aaron filled me in shortly after I emailed you and told me that the project's wide date range forced him to use a custom solution. Sounds like either JavaScript suppression or getting Tim to generate a list of 404s (though I doubt logging was turned on for that bucket) for generic image placement is the way to go. Unfortunately the S3 media hosting eliminates the possibility of a more elegant nginx fix.
Perhaps I'm missing something and there's a way to detect a missing photo in advance during page rendering, but I don't see how. I guess we could look the bioguide up against a list of all the photos we have -- maybe that's not too inefficient. Hacky, though.
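For what it's worth, a lookup against the list of photos does not have to be slow if the list is loaded into a set once. Here is a minimal sketch, assuming photos are stored as `<bioguide_id>.jpg`; the function names, base URL, and placeholder URL are all hypothetical, not anything from the Capitol Words code.

```python
import os


def photo_ids(filenames):
    """Turn photo filenames such as 'G000367.jpg' into a set of bioguide IDs.

    Assumes (hypothetically) that each photo is named after its bioguide ID.
    """
    return {os.path.splitext(name)[0] for name in filenames}


def photo_url(bioguide_id, known_ids, base_url, placeholder_url):
    """Return the photo URL if we have a photo, else the placeholder URL."""
    if bioguide_id in known_ids:  # set membership is a constant-time check
        return "%s/%s.jpg" % (base_url, bioguide_id)
    return placeholder_url


# Illustrative data only: the filename and URLs are made up for the example.
known = photo_ids(["A000001.jpg"])
have_photo = photo_url("A000001", known, "http://example.com/photos", "/no-photo.png")
no_photo = photo_url("G000367", known, "http://example.com/photos", "/no-photo.png")
```

The set would be built once at startup or on a schedule, so each page render only pays for the membership check.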
I think your assessment is basically right, though generating a JavaScript array of known valid bioguide IDs from the legislator API is not hard to do and keep up to date if CW downloads the CSV of legislator info every night. If the bioguide ID from the CR isn't in the array, display the no-photo pic.
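A rough sketch of the nightly step, under stated assumptions: the CSV column is called `bioguide_id` and the JavaScript variable is called `VALID_BIOGUIDE_IDS`. Neither name is confirmed against the real legislator CSV or the CW templates, and the sample rows are made up.

```python
import csv
import io
import json


def valid_bioguide_ids(csv_text, column="bioguide_id"):
    """Return the sorted, de-duplicated bioguide IDs found in the CSV text.

    The column name is an assumption; adjust it to the real CSV header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # Skip rows where the ID is missing or blank.
    ids = {(row.get(column) or "").strip() for row in reader}
    ids.discard("")
    return sorted(ids)


def to_js_array(ids, var_name="VALID_BIOGUIDE_IDS"):
    """Render the IDs as a JavaScript array declaration (hypothetical name)."""
    return "var %s = %s;" % (var_name, json.dumps(ids))


# Made-up sample rows, including a blank ID and a duplicate.
sample_csv = (
    "bioguide_id,lastname\n"
    "B000002,Example\n"
    ",Unknown\n"
    "A000001,Sample\n"
    "B000002,Example\n"
)
sample_ids = valid_bioguide_ids(sample_csv)
sample_js = to_js_array(sample_ids)
```

The page script would then check the bioguide ID from the CR against that array and swap in the no-photo pic when it is absent.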
As to the alternate solution of filling out our photo database - James has some code to automatically fetch and correctly size legislator photos, but I don't know how easily it could be adapted to serve older Congresses. It'd be neat if we could just take the list of distinct bioguide IDs that appear in the CapitolWords database and use it to generate a much more complete set of photos. It'd be fine to have photos of legislators that don't appear in the API.
Fixed in d0359d; current behavior is to hide images. A slight tweak could replace them with our 'no photo available' placeholder.
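Working out which photos to fetch is just a set difference. A minimal sketch, with a hypothetical function name; the database IDs below are the ones mentioned in this thread, and the "photos on hand" list is made up for illustration.

```python
def missing_photo_ids(ids_in_database, ids_with_photos):
    """Return the sorted bioguide IDs that appear in the database but have no photo."""
    return sorted(set(ids_in_database) - set(ids_with_photos))


# IDs from this thread stand in for the distinct IDs in the CapitolWords database;
# the second list is an invented example of what the photo store might contain.
database_ids = ["G000367", "G000536", "L000555", "C001039", "G000367"]
photos_on_hand = ["G000536", "A000001"]
to_fetch = missing_photo_ids(database_ids, photos_on_hand)
```

The resulting list could then be fed to whatever fetch-and-resize code ends up being used.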
Missing photo for Blanche Lincoln at http://capitolwords.org/legislator?chamber=&party=D&state=AR&congress=109
More missing photos at http://capitolwords.org/legislator?chamber=&party=R&state=MN&congress=105
Missing photo for Christensen at http://capitolwords.org/legislator?chamber=House&party=&state=&congress=112