datamade / war-chest

API for Chicago Aldermen's Campaign Funds
http://chicagoelections.datamade.us/war-chest/
1 stars 0 forks source link

Define a cycle and aggregate over cycles. #13

Closed fgregg closed 10 years ago

fgregg commented 10 years ago

For the alderman, we'll use the period from February 2007 to February 2011.

fgregg commented 10 years ago

@evz four new fields

  1. last-cycle-expenditures :
  2. last-cycle-receipts select sum(receipts), sum(expenditures) from report where committee_id=6301 and type='D-2 Semiannual Report' and period_from > '2007-07-01' and period_to < '2011-06-31' order by period_from;

3.current-cycle-expenditures 4.current-cycle-receipts select sum(receipts), sum(expenditures) from report where committee_id=6301 and type='D-2 Semiannual Report' and period_from > '2011-07-01' ;

fgregg commented 10 years ago

Okay, some candidates report not semiannually, but quarterly. Like our good friend Mr. Burke. http://www.elections.il.gov/CampaignDisclosure/CommitteeDetail.aspx?id=4410&pageindex=1

So we need a query like this:

select sum(receipts), sum(expenditures) from report 
inner join 
(select type, period_from, period_to, max(date_filed) as date_filed 
from report where committee_id = 4410 and (type like "D-2 Semiannual Report%" or type like "Quarterly%") 
group by replace(type, " (Amendment)", ""), period_from, period_to) 
using (type, period_from, period_to, date_filed) 
where period_from >= '2007-07-01' and period_to <= '2011-06-30' 
order by period_from;

It looks like this should be safe. I can't find any examples of overlapping semi-annual and quarterly reports.

fgregg commented 10 years ago

@evz can sqlalchemy handle this (particularly the replace part)?

evz commented 10 years ago

@fgregg This is what I got when I executed that query:

[(1125267.7100000002, 1075767.88)]

Look sensical?

fgregg commented 10 years ago

closed by 14920e48620a0b6115829eb5542ec6fd0240c886

fgregg commented 10 years ago
sqlite> select committee_id, type, period_from, period_to from report inner join (select type, period_from, period_to, max(date_filed) as date_filed, committee_id from report where committee_id = 22976 and (type like "D-2 Semiannual Report%" or type like "Quarterly%") group by replace(type, " (Amendment)", ""), period_from, period_to, committee_id) using (type, period_from, period_to, date_filed, committee_id) order by period_from;
22976|Quarterly|2011-07-01|2011-09-30
22976|Quarterly (Amendment)|2011-10-01|2011-12-31
22976|Quarterly|2012-04-01|2012-06-30
22976|Quarterly|2012-10-01|2012-12-31
22976|Quarterly|2013-01-01|2013-03-31
22976|Quarterly|2013-04-01|2013-06-30
22976|Quarterly|2013-07-01|2013-09-30

Is not picking up all the reports in the db. Frustratingly, the following seems to be correct:

select type, period_from, period_to, max(date_filed) as date_filed, committee_id from report where committee_id = 22976 and (type like "D-2 Semiannual Report%" or type like "Quarterly%") group by replace(type, " (Amendment)", ""), period_from, period_to, committee_id;
D-2 Semiannual Report|2010-07-01|2010-12-31|2011-01-23 21:59:22.000000|22976
Quarterly|2011-01-01|2011-03-31|2011-10-14 16:31:41.000000|22976
Quarterly|2011-04-01|2011-06-30|2011-10-14 16:32:07.000000|22976
Quarterly|2011-07-01|2011-09-30|2011-10-17 15:56:45.000000|22976
Quarterly (Amendment)|2011-10-01|2011-12-31|2012-07-16 14:25:08.000000|22976
Quarterly|2012-01-01|2012-03-31|2012-07-16 14:27:33.000000|22976
Quarterly|2012-04-01|2012-06-30|2012-07-16 14:55:33.000000|22976
Quarterly|2012-07-01|2012-09-30|2013-01-15 18:32:56.000000|22976
Quarterly|2012-10-01|2012-12-31|2013-01-15 18:55:03.000000|22976
Quarterly|2013-01-01|2013-03-31|2013-04-15 18:05:02.000000|22976
Quarterly|2013-04-01|2013-06-30|2013-07-15 08:45:48.000000|22976
Quarterly|2013-07-01|2013-09-30|2013-10-15 19:34:19.000000|22976

This is creating problems in the data, you can see this in Harry Osterman's case, where previous expenditures and receipts are incorrectly null.

@evz, @derekeder would you look at this query