sunlightlabs / read_FEC

Turn raw electronic FEC filings into meaningful data
http://realtime.influenceexplorer.com
BSD 3-Clause "New" or "Revised" License

filtering out subitemized amounts in candidate IE totals #125

Open boblannon opened 9 years ago

boblannon commented 9 years ago

It seems like these aggregates should be ignoring lines with memo_code='X': https://github.com/sunlightlabs/read_FEC/blob/master/fecreader/summary_data/management/commands/make_candidate_os_aggregates.py#L36-L38

@jsfenfen can i get your input?

jsfenfen commented 9 years ago

There's a cron job that sets the superceded_by_amendment flag to true in the skede model on lines that have an X or something in the memo field. That's a hack--there should be a separate field--but that's how it's been working.

There's also a superceded_by_amendment field on the new_filing model that indicates whether the entire filing has been superseded.
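For anyone following along, the flagging behavior described above can be sketched roughly as follows. This is plain Python, not the actual Django models or the real cron job: `SkedELine` and `flag_subitemizations` are hypothetical names, and treating any memo code containing "X" as a subitemization is an assumption taken from the description.

```python
from dataclasses import dataclass


@dataclass
class SkedELine:
    """Hypothetical stand-in for one row of the skede model."""
    amount: float
    memo_code: str = ""
    superceded_by_amendment: bool = False


def flag_subitemizations(lines):
    """Set the catchall flag on lines whose memo code contains an X.

    Assumption: an X in the memo field marks a subitemization.
    Returns the number of lines newly flagged on this pass.
    """
    flagged = 0
    for line in lines:
        if "X" in (line.memo_code or "") and not line.superceded_by_amendment:
            line.superceded_by_amendment = True
            flagged += 1
    return flagged


lines = [SkedELine(1000.0), SkedELine(600.0, "X"), SkedELine(400.0, "X")]
print(flag_subitemizations(lines))  # count of lines flagged by this pass
```

Because the flag is shared with amended lines, nothing downstream can tell from the flag alone why a line was excluded.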

boblannon commented 9 years ago

are superceded_by_amendment and memo_code related? The difference in the aggregate number is explained by including subitemizations...

jsfenfen commented 9 years ago

Let's take a step back.

Realtime uses superceded_by_amendment on the skede lines as a catchall flag to say "ignore this itemization in certain contexts". So sked e lines that are marked superceded_by_amendment have either been amended or are subitemizations. This was a bad choice, btw; it's just that it's really hard to know when to include subitemizations or not. Look at the file generation routines--I think they are ignored for dumping big files, but included for dumping single filings, maybe?

The problem, in general, is that some users will want to see the subitemizations, and others won't.

A better approach is probably to always show subitemizations in the downloaded files, and just let the users be confused.
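To make the trade-off concrete, here is a minimal sketch of what separate flags might look like. It assumes two hypothetical boolean fields, `is_amended` and `is_subitemization`, in place of the single catchall; none of these names exist in the repo.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """Hypothetical itemization row with one flag per exclusion reason."""
    amount: float
    is_amended: bool = False         # assumption: replaced by a later amendment
    is_subitemization: bool = False  # assumption: memo / subitemized line


def lines_for_download(lines):
    """Keep subitemizations; drop only lines replaced by an amendment."""
    return [ln for ln in lines if not ln.is_amended]


def lines_for_totals(lines):
    """Drop both amended lines and subitemizations."""
    return [
        ln for ln in lines
        if not ln.is_amended and not ln.is_subitemization
    ]
```

With the reasons split out, each context (downloads, totals) can pick its own rule instead of inheriting whatever the catchall flag happens to cover.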

boblannon commented 9 years ago

ah, okay. will continue digging.

jsfenfen commented 9 years ago

Also see this: https://github.com/sunlightlabs/read_FEC/blob/master/fecreader/formdata/management/commands/mark_superceded_body_rows.py -- superceded_by_amendment is also set to true on immediate notice forms once the periodic forms have been received. Again, this was a bad choice of flag name.

Also see this: https://github.com/sunlightlabs/read_FEC/blob/master/fecreader/summary_data/management/commands/mark_amended_skede_lines.py

boblannon commented 9 years ago

okay, but back to the question about aggregate numbers: is there ever a reason to include subitemizations when aggregating? the FEC suggests they shouldn't be included in committee totals.
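A small illustration of why this matters, using made-up rows rather than real filing data: assume one parent expenditure and two memo lines (memo_code 'X') that break it down. The `total` helper and the row layout are hypothetical.

```python
from decimal import Decimal

# Hypothetical rows: one parent expenditure plus two memo lines that itemize it.
rows = [
    {"amount": Decimal("1000.00"), "memo_code": ""},
    {"amount": Decimal("600.00"), "memo_code": "X"},
    {"amount": Decimal("400.00"), "memo_code": "X"},
]


def total(rows, include_memo=False):
    """Sum amounts, skipping memo_code == 'X' rows unless include_memo is set."""
    return sum(
        (r["amount"] for r in rows if include_memo or r["memo_code"] != "X"),
        Decimal("0"),
    )


print(total(rows))                     # 1000.00
print(total(rows, include_memo=True))  # 2000.00
```

Under that assumption, including the memo lines counts the same money twice.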

jsfenfen commented 9 years ago

No, of course not, but they are set to superceded_by_amendment at a really basic level--the bigger issue to me is making sure that realtime is correctly excluding them.

This is where it happens, btw:

https://github.com/sunlightlabs/read_FEC/blob/master/fecreader/formdata/utils/filing_body_processor.py#L29

jsfenfen commented 9 years ago

I feel like spaces have already been eaten before this, but is it possible that at the time these are processed they are marked with something else--like lower case 'x' or other characters? You'd have to follow the whole logic through to this, but...
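One way to rule that out would be to normalize the memo code before comparing it. A minimal sketch; `is_memo` is a hypothetical helper, not something that exists in the repo, and it assumes a lone X (in any case, with stray whitespace) is the only marker of interest.

```python
def is_memo(memo_code):
    """Return True if the memo code is an X once whitespace and case are ignored."""
    return (memo_code or "").strip().upper() == "X"


for raw in ["X", "x", " X ", "", None]:
    print(repr(raw), is_memo(raw))
```

Comparing counts from this check against a strict `memo_code == 'X'` match would show whether any variants are slipping through.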

boblannon commented 9 years ago

okay, thanks so much for the info. i'll check this out.