propublica / congress-api-docs

Documentation for the ProPublica Congress API
https://projects.propublica.org/api-docs/congress-api/

Member responses are inconsistent in the data they return #243

Open ndawg opened 4 years ago

ndawg commented 4 years ago

Of the following endpoints:

  1. List of Members
  2. Get a Specific Member
  3. Get New Members
  4. Get Current Members by State/District
  5. Get Members Leaving Office

The data returned for each member is inconsistent across these endpoints, with some omissions that don't seem to make much sense. I'm not sure of the best way to communicate all of these discrepancies, but here's my attempt:

field 1 2 3 4 5
first_name
middle_name
last_name
suffix
district
id
party
api_uri
gender
state
facebook_account
twitter_account
youtube_account
date_of_birth
last_updated
govtrack_id
cspan_id
votesmart_id
icpsr_id
crp_id
google_entity_id
url
rss_url
in_office
title
short_title
leadership_role
seniority
contact_form
fec_candidate_id
office
phone
fax
missed_votes_pct
votes_with_party_pct
ocd_id
dw_nominate
ideal_point
total_votes
missed_votes
total_present
senate_class
state_rank
lis_id
next_election
times_topics_url
times_tag
most_recent_vote
chamber
start_date
name
end_date
status
note

Some notes to explain the warning signs:

A few fields omitted from the second response are present in its roles subfield; these values are marked with ✳. In addition, the roles field contains the following values that the first response does not include at all: at_large, bills_sponsored, and bills_cosponsored.

Other than this, it seems like a lot of the missing information could be included in the various responses. If you'd like me to be more thorough about what I think needs to be added in certain places, I can do that too, but it seems like it's all over the place at the moment.
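For anyone who wants to reproduce this kind of comparison, here is a minimal sketch in Python. It assumes you have already fetched one member object from each endpoint and parsed it into a dict; the `field_matrix` helper and the sample payloads are hypothetical and trimmed down, not real API output.

```python
from typing import Dict, Set


def field_matrix(responses: Dict[str, dict]) -> Dict[str, Dict[str, bool]]:
    """Map each top-level field name to {endpoint label: is the field present?}."""
    all_fields: Set[str] = set()
    for member in responses.values():
        all_fields.update(member.keys())
    return {
        field: {label: field in member for label, member in responses.items()}
        for field in sorted(all_fields)
    }


# Hypothetical member objects, one per endpoint (illustrative values only).
samples = {
    "list": {"id": "X000001", "first_name": "Jane", "party": "D"},
    "get": {"id": "X000001", "first_name": "Jane", "current_party": "D", "roles": []},
}

matrix = field_matrix(samples)
for field, presence in matrix.items():
    marks = " ".join("x" if presence[label] else "-" for label in samples)
    print(f"{field:15} {marks}")
```

This only looks at top-level keys, so fields that live inside roles would need to be handled separately.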

dwillis commented 4 years ago

@ndawg thanks for this - it's a big effort, and you're right, we should standardize things better. Sounds like something we can do for a V2 version.

ndawg commented 4 years ago

I agree v2 is probably a better opportunity to overhaul these responses. For now, I think it's worth shoring up the responses from the "Get a Specific Member" endpoint in particular, which seems like it should return all possible information. So, the changes required:

It would be nice to include their current retiring status (status and note), perhaps wrapped in a retiring field that is null unless they are retiring.
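To make the shape of that suggestion concrete, here is a rough sketch of how a client could build such a field today. It is only an illustration of the proposal, not how the API behaves: `with_retiring` is a hypothetical helper, and `leaving` is assumed to be the member's entry from "Get Members Leaving Office" (or None if they are not in that list).

```python
from typing import Optional


def with_retiring(member: dict, leaving: Optional[dict]) -> dict:
    """Return a copy of `member` with a `retiring` field added.

    `retiring` is None unless a leaving-office entry is supplied, in which
    case it wraps that entry's `status` and `note` values.
    """
    result = dict(member)
    if leaving is None:
        result["retiring"] = None
    else:
        result["retiring"] = {
            "status": leaving.get("status"),
            "note": leaving.get("note"),
        }
    return result


# Hypothetical values, for illustration only.
staying = with_retiring({"id": "X000001"}, None)
going = with_retiring(
    {"id": "X000002"},
    {"id": "X000002", "status": "Retiring", "note": "Example note"},
)
```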

Also, it looks like title from response 1 corresponds to role in response 4, which I missed in the table, so I'll fix that above.

dwillis commented 4 years ago

@ndawg Seems reasonable. I would note that, for "Get a Specific Member", party is actually present in the roles array, as it is attached to a specific timeframe. We use current_party at the top level because people can and do change parties in the course of a congress. For the last set of attributes you'd like included, those seem like they should be in the roles array as well, not the top level, yes?
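As an illustration of why party belongs to a timeframe, here is a small Python sketch that looks up a member's party on a given date from the roles array. It assumes each role carries party, start_date, and end_date, with dates as ISO `YYYY-MM-DD` strings; `party_on` is a hypothetical helper and the member below is made up.

```python
from datetime import date
from typing import Optional


def party_on(member: dict, day: date) -> Optional[str]:
    """Return the party from the role whose date range covers `day`, if any."""
    for role in member.get("roles", []):
        start = date.fromisoformat(role["start_date"])
        end = date.fromisoformat(role["end_date"])
        if start <= day <= end:
            return role.get("party")
    return None


# Hypothetical member who changed parties during a congress.
member = {
    "id": "X000001",
    "current_party": "I",
    "roles": [
        {"party": "I", "start_date": "2019-07-04", "end_date": "2021-01-03"},
        {"party": "R", "start_date": "2019-01-03", "end_date": "2019-07-03"},
    ],
}

print(party_on(member, date(2019, 3, 1)))
print(party_on(member, date(2020, 1, 1)))
```

With data shaped like this, the top-level current_party only answers "what party is this member in now", while the roles entries answer the question for any date they cover.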

ndawg commented 4 years ago

@dwillis interesting about the current_party distinction, I wasn't aware people were flipping parties mid-session. And yeah, that's correct, they should be in the roles array.

dwillis commented 4 years ago

@ndawg I've started on this - we've deployed an update to "Get a Specific Member" responses and updated the example.

ndawg commented 4 years ago

@dwillis for listing members, what do you think about including a role field that contains the member's current position information? As in, the same as roles when getting a specific member, but there is just one role attached (without committee and subcommittee information in that role).

This isn't necessarily a breaking change but would help organize things really well. Just an idea - having the information there at all is already miles better!
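Roughly what I have in mind, sketched in Python as a client-side transformation. This assumes the roles entries look like the ones from "Get a Specific Member" and that the committee data sits under keys named committees and subcommittees; those key names and the `current_role` helper are assumptions for illustration.

```python
from typing import Optional

# Keys assumed to hold committee data inside a role entry.
COMMITTEE_KEYS = ("committees", "subcommittees")


def current_role(member: dict) -> Optional[dict]:
    """Return the most recent role without committee data, or None if no roles."""
    roles = member.get("roles") or []
    if not roles:
        return None
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    latest = max(roles, key=lambda role: role["start_date"])
    return {k: v for k, v in latest.items() if k not in COMMITTEE_KEYS}


# Hypothetical member object, for illustration only.
member = {
    "id": "X000001",
    "roles": [
        {"congress": "116", "start_date": "2019-01-03", "committees": [], "subcommittees": []},
        {"congress": "115", "start_date": "2017-01-03", "committees": [], "subcommittees": []},
    ],
}

role = current_role(member)
```

Picking the role by latest start_date rather than by position avoids relying on the order of the array.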

dwillis commented 4 years ago

@ndawg let me check into it - I guess my first thought would be that the list of members includes everyone who has served in a given congress, even if they are not current members, so that would mean we'd need to figure out how to treat a role attribute for those folks.

ndawg commented 4 years ago

I haven't had too much time to look into this more, but I did go ahead and update the table. I think there might be a few inaccuracies (not sure about geoid and at_large, for instance). However, it definitely looks much healthier than the original.

Note that this is just comparing the List of Members and Get a Specific Member endpoints for now.

fields list get
id
title
short_title
api_uri
first_name
middle_name
last_name
suffix
date_of_birth
gender
party
leadership_role
twitter_account
facebook_account
youtube_account
govtrack_id
cspan_id
votesmart_id
icpsr_id
crp_id
google_entity_id
fec_candidate_id
url
rss_url
contact_form
in_office
cook_pvi
dw_nominate
ideal_point
seniority
next_election
total_votes
missed_votes
total_present
last_updated
ocd_id
office
phone
fax
state
district
at_large
geoid
missed_votes_pct
votes_with_party_pct
votes_against_party_pct
times_topics_url
times_tag
current_party
most_recent_vote

ndawg commented 4 years ago

Here's a more thorough and well-vetted table, again only comparing the list and get methods, including all the sub-properties of the roles entries. I compiled it for all members of Congress in both chambers, so it has some mutually exclusive fields, like district versus senate_class.

fields list get
id
title
short_title
api_uri
first_name
middle_name
last_name
suffix
date_of_birth
gender
party
leadership_role
twitter_account
facebook_account
youtube_account
govtrack_id
cspan_id
votesmart_id
icpsr_id
crp_id
google_entity_id
fec_candidate_id
url
rss_url
contact_form
in_office
cook_pvi
dw_nominate
ideal_point
seniority
next_election
total_votes
missed_votes
total_present
last_updated
ocd_id
office
phone
fax
state
district
at_large
geoid
missed_votes_pct
votes_with_party_pct
votes_against_party_pct
senate_class
state_rank
lis_id
member_id
times_topics_url
times_tag
current_party
most_recent_vote
congress
chamber
start_date
end_date
bills_sponsored
bills_cosponsored
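
For reference, a table like this can be compiled with something along the lines of the Python sketch below. It is not the exact script used here: `flatten_fields` and `field_counts` are hypothetical helpers, the members are made-up stand-ins for real responses, and sub-properties of roles entries are simply merged into the same set of field names as the top-level ones.

```python
from collections import Counter
from typing import Iterable, Set


def flatten_fields(member: dict) -> Set[str]:
    """Collect top-level field names plus the keys of every roles entry."""
    fields = {key for key in member if key != "roles"}
    for role in member.get("roles", []):
        fields.update(role.keys())
    return fields


def field_counts(members: Iterable[dict]) -> Counter:
    """Count how many members expose each field name."""
    counts: Counter = Counter()
    for member in members:
        counts.update(flatten_fields(member))
    return counts


# Hypothetical members from each chamber (illustrative values only).
members = [
    {"id": "H000001", "roles": [{"chamber": "House", "district": "5"}]},
    {"id": "S000001", "roles": [{"chamber": "Senate", "senate_class": "2"}]},
]

counts = field_counts(members)
for field, seen in sorted(counts.items()):
    print(f"{field:15} {seen}/{len(members)}")
```

Fields that appear for only some members, such as district and senate_class here, show up with a count below the total, which is how the mutually exclusive fields stand out.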