fecgov / openFEC

The first RESTful API for the Federal Election Commission. We're aiming to make campaign finance more accessible for journalists, academics, developers, and other transparency seekers.
https://api.open.fec.gov/developers

Restrict election aggregates to election year #1382

Closed jmcarp closed 9 years ago

jmcarp commented 9 years ago

[Sorry for a very long issue!]

The election aggregates proposed in #1352 use cand_cmte_linkage.fec_election_yr to join candidates on committees and two-year periods. As @PaulClark2 pointed out, this can cause problems when linkages have fec_election_yr values that extend beyond the corresponding cand_election_yr (which is the year of the actual election):

I'm looking at HRC's 2016 full cycle numbers. The 2013-2014 time period displays $29 million in receipts but that money was for her 2008 campaign. I wonder if it's a linkage problem. The 2016 links for HRC look correct in FECP.

I was hoping we could resolve this issue by using both cand_election_yr (to figure out which election we're talking about) and fec_election_yr (to figure out which two-year period we're talking about). This seems to work fine for Senate candidates, where cand_election_yr and fec_election_yr values tend to look like this:

fec=> select cand_id, cand_election_yr, fec_election_yr from cand_cmte_linkage where cand_id = 'S2KY00012' and cmte_dsgn in ('P', 'A') and cand_election_yr = 2014;
  cand_id  | cand_election_yr | fec_election_yr
-----------+------------------+-----------------
 S2KY00012 |             2014 |            2014
 S2KY00012 |             2014 |            2012
 S2KY00012 |             2014 |            2010
(3 rows)

But for many Presidential candidates, fec_election_yr values are greater than or equal to their corresponding cand_election_yr values:

fec=> select cand_id, cand_election_yr, fec_election_yr from cand_cmte_linkage where cand_id = 'P80003338' and cmte_dsgn in ('P', 'A') and cand_election_yr = 2012;
  cand_id  | cand_election_yr | fec_election_yr
-----------+------------------+-----------------
 P80003338 |             2012 |            2014
 P80003338 |             2012 |            2012
 P80003338 |             2012 |            2016
(3 rows)

In other words, as far as I can tell, we don't know Barack Obama's linked committee(s) for the 2009-2010 two-year period for the 2012 election.
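
To make the mismatch concrete, here is a minimal Python sketch of the restriction named in the issue title: keep only linkages whose fec_election_yr does not extend past cand_election_yr. The rows are copied from the two query results above; the `Linkage` type and both helper functions are hypothetical names used for illustration, not part of the openFEC codebase.

```python
from typing import Dict, List, NamedTuple, Tuple


class Linkage(NamedTuple):
    cand_id: str
    cand_election_yr: int
    fec_election_yr: int


# Rows copied from the two query results above.
ROWS = [
    Linkage("S2KY00012", 2014, 2014),
    Linkage("S2KY00012", 2014, 2012),
    Linkage("S2KY00012", 2014, 2010),
    Linkage("P80003338", 2012, 2014),
    Linkage("P80003338", 2012, 2012),
    Linkage("P80003338", 2012, 2016),
]


def within_election(rows: List[Linkage]) -> List[Linkage]:
    """Keep linkages whose two-year period ends on or before the election year."""
    return [r for r in rows if r.fec_election_yr <= r.cand_election_yr]


def periods_by_election(rows: List[Linkage]) -> Dict[Tuple[str, int], List[int]]:
    """Group the remaining two-year periods by (candidate, election year)."""
    grouped: Dict[Tuple[str, int], List[int]] = {}
    for r in rows:
        grouped.setdefault((r.cand_id, r.cand_election_yr), []).append(r.fec_election_yr)
    return {key: sorted(periods) for key, periods in grouped.items()}


kept = periods_by_election(within_election(ROWS))
for key, periods in kept.items():
    print(key, periods)
```

With these rows the Senate candidate keeps all three periods, while the Presidential candidate keeps only the 2012 period, which is the gap for 2009-2010 described above.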

I chatted with @PaulClark2 about this yesterday, so I think I understand the data correctly, but I may still be mistaken. Anyway, I see a few ways to handle this:

Thoughts @PaulClark2 @jwchumley @wluoFEC @LindsayYoung @noahmanger? Also, would you all be opposed to merging the current code, and then adjusting after we discuss, or should we hold off on merging until we've revised this behavior?

LindsayYoung commented 9 years ago

Could we treat all future fec_election_yr as the current cycle for purposes of merging? Like having a row like that pre-calculated that we just use for this merge. Maybe that is adding more complexity, but it seems simpler to me.

As for the third option, would changing cand_cmte_linkage mess with the 2-year periods? We have done a lot of vetting with them, so I am cautious about changes that might introduce unforeseen errors.

PaulClark2 commented 9 years ago

Most presidential candidates do not declare more than two years before the election. So in the 2012 Obama example, he didn't have a PCC in 2010 for the 2012 election because he didn't file his statement of candidacy until April 2011.

So if we are looking at Obama's 2012 campaign, the correct answer for 2009-2010 is that he didn't have a PCC. All of the activity reported on OFA's filings in 2009 and 2010 is related to the 2008 election.

select cand_id, cmte_id, cand_election_yr, fec_election_yr from cand_cmte_linkage where cand_id = 'P80003338' and fec_election_yr in (2008, 2010, 2012) and cmte_tp = 'P' and cmte_dsgn in ('P', 'A') order by fec_election_yr;

 CAND_ID   | CMTE_ID   | CAND_ELECTION_YR | FEC_ELECTION_YR
-----------+-----------+------------------+-----------------
 P80003338 | C00431445 |             2008 |            2008
 P80003338 | C00431445 |             2008 |            2010
 P80003338 | C00431445 |             2012 |            2012

jmcarp commented 9 years ago

@PaulClark2 oh, that sounds like the point I was missing! In that case, given that Obama didn't have a PCC for 2012 in 2009-2010, would you expect Obama's 2009-2012 aggregates to only include fundraising by his PCC in 2011-2012? If so, this becomes very simple to address.

PaulClark2 commented 9 years ago

Great! I probably wasn't as clear about that point as I should have been.

jmcarp commented 9 years ago

Now that I think about it, the last option we just discussed doesn't seem right either. It might be technically correct, but it probably isn't what users expect. At the risk of pedantic repetition, that option would only include 2011-2012 totals for the election aggregate for Barack Obama's 2012 Presidential campaign, since the candidate didn't have a PCC for the 2012 election in 2009-2010. That means the candidate's four-year totals (2009-2012) would be the same as his 2011-2012 totals. The 2009-2010 and 2011-2012 totals also wouldn't add up to the 2009-2012 totals, which also seems problematic. It seems like this would defeat the purpose of election aggregates for Presidential candidates (although they would work fine for Senate candidates).
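
The arithmetic problem described above can be sketched in a few lines of Python. The linkage pairs come from the query result earlier in the thread; the receipt amounts are HYPOTHETICAL placeholders chosen only to make the sums easy to follow, not FEC figures, and the function names are made up for this illustration.

```python
# (cand_election_yr, fec_election_yr) pairs for P80003338 / C00431445,
# taken from the linkage query result earlier in the thread.
LINKAGES = [(2008, 2008), (2008, 2010), (2012, 2012)]

# HYPOTHETICAL receipts per two-year period (placeholder units, not real data).
RECEIPTS = {2008: 300, 2010: 100, 2012: 200}


def strict_election_total(cand_election_yr: int) -> int:
    """Sum only the periods whose linkage row belongs to this election."""
    return sum(
        RECEIPTS[fec_yr]
        for cand_yr, fec_yr in LINKAGES
        if cand_yr == cand_election_yr
    )


def period_total(fec_election_yr: int) -> int:
    """Committee total for one two-year period, regardless of election."""
    return RECEIPTS[fec_election_yr]


four_year = strict_election_total(2012)                  # 2012 linkage only
two_year_sum = period_total(2010) + period_total(2012)   # what the period views show

print("strict 2009-2012 aggregate:", four_year)
print("2009-2010 plus 2011-2012:", two_year_sum)
```

Under the strict rule, the 2009-2012 aggregate collapses to the 2011-2012 total, and it no longer matches the sum of the two period totals a user would see elsewhere.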

I'm wondering if we could resolve this more quickly with a short video chat than over GitHub, since I'm still not sure how best to handle this. What do you think @PaulClark2 @LindsayYoung?

PaulClark2 commented 9 years ago

OK, I spoke with a few folks here at FEC, and we think we want to just go with specific time slices. For example, for Obama's 2012 campaign we show 2009-2012 activity without regard for the 2009-2010 reports disclosing 2008 activity. Of course, we'll need some explanatory text noting that the 2-, 4-, or 6-year time periods might include financial activity from different elections.
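
A rough sketch of the time-slice approach, assuming each two-year period is identified by its ending even year (as fec_election_yr is in the queries above) and that the cycle length is passed in as 2, 4, or 6 years. The function names are hypothetical and only illustrate the idea; this is not the openFEC implementation.

```python
from typing import List


def two_year_periods(cand_election_yr: int, cycle_years: int) -> List[int]:
    """Return the ending years of the two-year periods in an election cycle.

    The periods are chosen purely by date, ignoring which election the
    underlying reports actually relate to.
    """
    if cycle_years not in (2, 4, 6):
        raise ValueError("cycle_years must be 2, 4, or 6")
    first = cand_election_yr - cycle_years + 2
    return list(range(first, cand_election_yr + 2, 2))


def period_label(fec_election_yr: int) -> str:
    """Format a two-year period such as 2010 as '2009-2010'."""
    return f"{fec_election_yr - 1}-{fec_election_yr}"


presidential_2012 = two_year_periods(2012, 4)
print([period_label(yr) for yr in presidential_2012])
```

For the 2012 Presidential example this selects the 2009-2010 and 2011-2012 periods, so the four-year total is the sum of the two period totals, at the cost of including activity that relates to the 2008 election.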

jmcarp commented 9 years ago

Thanks @PaulClark2. That's the behavior we have in place now, so I'm going to close this issue. I agree that we should explain what's going on under the hood. @emileighoutlaw, do you have time to think about this? It's kind of a thorny issue, so it might be useful to talk it through quickly at some point.