washingtonpost / elex-live-model

a model to generate estimates of the number of outstanding votes on an election night based on the current results of the race

ELEX-3468 save unit level turnout prediction from bootstrap model to s3 #104

Closed dmnapolitano closed 1 month ago

dmnapolitano commented 1 month ago

Description

Hi! The changes in this PR:

Please let me know what you think and if this is what you're looking for 😬

Jira Ticket

ELEX-3468

Test Steps

Run tox, then run an elexmodel command with --save_output results, such as:

elexmodel 2022-11-08_USA_G --estimands=margin --features=baseline_normalized_margin --office_id=G_county --geographic_unit_type=county --pi_method bootstrap --national_summary --save_output results

Confirm in the s3 bucket under unit_data/ and each aggregate (except nat_sum_data/) that pred_turnout now appears in every CSV 🎉
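For anyone who would rather script that check than eyeball it, here is a minimal sketch. It assumes the saved CSVs have already been copied from the bucket into a local directory; the function name `csvs_missing_column` and the directory layout are hypothetical and not part of elexmodel. It reads only each file's header row and reports the files that lack `pred_turnout`, skipping `nat_sum_data/` since that aggregate is not expected to have it.

```python
import csv
from pathlib import Path


def csvs_missing_column(root, column="pred_turnout", skip_dirs=("nat_sum_data",)):
    """Return the CSV files under `root` whose header row lacks `column`.

    Hypothetical helper: `root` is assumed to be a local copy of the
    saved model output, with one sub-directory per aggregate.
    """
    missing = []
    for path in sorted(Path(root).rglob("*.csv")):
        # nat_sum_data/ is not expected to carry the column, so skip it.
        if any(part in skip_dirs for part in path.relative_to(root).parts):
            continue
        with path.open(newline="") as f:
            # Only the header row matters; empty files yield an empty header.
            header = next(csv.reader(f), [])
        if column not in header:
            missing.append(path)
    return missing
```

An empty list from `csvs_missing_column("path/to/local/copy")` would mean every checked CSV has the column; any paths returned are the files to look at.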

lennybronner commented 1 month ago

This looks great, thanks for the refactor. I'm curious though, how does pred_turnout appear in the aggregated csvs? especially cuz it's not in the printout we get at the end of running the model (but pred_turnout is there for the units if we include those)

dmnapolitano commented 1 month ago

Wait, for real? I thought I saw it there. Let me check 🤔

Also thanks 😄

dmnapolitano commented 1 month ago

Ok, maybe I'm looking at the wrong stuff 🤔 Here's what I did:

  1. git checkout this branch
  2. pip install -e . --force-reinstall to be ultra-sure this branch is installed
  3. Ran the elexmodel command above

My output (on stdout) is this:

state_data 
    postal_code  pred_margin  results_margin  reporting  pred_turnout  lower_0.7_margin  upper_0.7_margin  lower_0.9_margin  upper_0.9_margin
0           AL    -0.393030       -0.393030       67.0     1356532.0         -0.393030         -0.393030         -0.393030         -0.393030
1           AR    -0.283264       -0.283264       75.0      888040.0         -0.283264         -0.283264         -0.283264         -0.283264
2           AZ     0.006689        0.006689       15.0     2558664.0          0.006689          0.006689          0.006689          0.006689
3           CA     0.183590        0.183590       58.0    10933009.0          0.183590          0.183590          0.183590          0.183590
4           CO     0.198016        0.198016       64.0     2451521.0          0.198016          0.198016          0.198016          0.198016
5           CT     0.130843        0.130843      169.0     1256542.0          0.130843          0.130843          0.130843          0.130843
6           FL    -0.195396       -0.195396       67.0     7719252.0         -0.195396         -0.195396         -0.195396         -0.195396
7           GA    -0.076204       -0.076204      159.0     3921799.0         -0.076204         -0.076204         -0.076204         -0.076204
8           HI     0.264236        0.264236        4.0      411159.0          0.264236          0.264236          0.264236          0.264236
9           IA    -0.189771       -0.189771       99.0     1192095.0         -0.189771         -0.189771         -0.189771         -0.189771
10          ID    -0.498027       -0.498027       44.0      478743.0         -0.498027         -0.498027         -0.498027         -0.498027
11          IL     0.122584        0.122584      102.0     3915537.0          0.122584          0.122584          0.122584          0.122584
12          KS     0.021666        0.021666      105.0      963618.0          0.021666          0.021666          0.021666          0.021666
13          MA     0.291131        0.291131      350.0     2393921.0          0.291131          0.291131          0.291131          0.291131
14          MD     0.335384        0.335384       24.0     1937978.0          0.335384          0.335384          0.335384          0.335384
15          MI     0.107078        0.107078       83.0     4386296.0          0.107078          0.107078          0.107078          0.107078
16          MN     0.079106        0.079106       87.0     2432290.0          0.079106          0.079106          0.079106          0.079106
17          NE    -0.247787       -0.247787       93.0      632948.0         -0.247787         -0.247787         -0.247787         -0.247787
18          NH    -0.156591       -0.156591      238.0      608874.0         -0.156591         -0.156591         -0.156591         -0.156591
19          NM     0.065389        0.065389       33.0      694732.0          0.065389          0.065389          0.065389          0.065389
20          NV    -0.014417       -0.014417       17.0      968663.0         -0.014417         -0.014417         -0.014417         -0.014417
21          NY     0.057081        0.057081       62.0     5734113.0          0.057081          0.057081          0.057081          0.057081
22          OH    -0.255851       -0.255851       88.0     4025984.0         -0.255851         -0.255851         -0.255851         -0.255851
23          OK    -0.140599       -0.140599       77.0     1120306.0         -0.140599         -0.140599         -0.140599         -0.140599
24          OR     0.037754        0.037754       36.0     1767421.0          0.037754          0.037754          0.037754          0.037754
25          PA     0.150363        0.150363       67.0     5262621.0          0.150363          0.150363          0.150363          0.150363
26          RI     0.196523        0.196523       39.0      345130.0          0.196523          0.196523          0.196523          0.196523
27          SC    -0.175953       -0.175953       46.0     1681192.0         -0.175953         -0.175953         -0.175953         -0.175953
28          SD    -0.275992       -0.275992       66.0      340140.0         -0.275992         -0.275992         -0.275992         -0.275992
29          TN    -0.326978       -0.326978       95.0     1700250.0         -0.326978         -0.326978         -0.326978         -0.326978
30          TX    -0.111414       -0.111414      254.0     7965807.0         -0.111414         -0.111414         -0.111414         -0.111414
31          VT    -0.494894       -0.494894      246.0      264515.0         -0.494894         -0.494894         -0.494894         -0.494894
32          WI     0.034526        0.034526       72.0     2627090.0          0.034526          0.034526          0.034526          0.034526
33          WY    -0.648114       -0.648114       23.0      174352.0         -0.648114         -0.648114         -0.648114         -0.648114 

unit_data 
      postal_code geographic_unit_fips  pred_margin  reporting  lower_0.7_margin  upper_0.7_margin  lower_0.9_margin  upper_0.9_margin  results_margin  pred_turnout
0             AL                01001      -9852.0          1           -9852.0           -9852.0           -9852.0           -9852.0         -9852.0       16864.0
1             AL                01003     -48224.0          1          -48224.0          -48224.0          -48224.0          -48224.0        -48224.0       68660.0
2             AL                01005      -1336.0          1           -1336.0           -1336.0           -1336.0           -1336.0         -1336.0        6416.0
3             AL                01007      -3750.0          1           -3750.0           -3750.0           -3750.0           -3750.0         -3750.0        5610.0
4             AL                01009     -13945.0          1          -13945.0          -13945.0          -13945.0          -13945.0        -13945.0       15797.0
...          ...                  ...          ...        ...               ...               ...               ...               ...             ...           ...
2077          WY                56037      -6956.0          1           -6956.0           -6956.0           -6956.0           -6956.0         -6956.0       10698.0
2078          WY                56039        -17.0          1             -17.0             -17.0             -17.0             -17.0           -17.0        9869.0
2079          WY                56041      -4469.0          1           -4469.0           -4469.0           -4469.0           -4469.0         -4469.0        5977.0
2080          WY                56043      -2198.0          1           -2198.0           -2198.0           -2198.0           -2198.0         -2198.0        2732.0
2081          WY                56045      -1855.0          1           -1855.0           -1855.0           -1855.0           -1855.0         -1855.0        2211.0

[3125 rows x 10 columns] 

nat_sum_data 
   estimand  agg_pred  agg_lower  agg_upper
0   margin      17.0       17.0       17.0 

Is this ok? 🤔

lennybronner commented 1 month ago

yes, I am an absolute idiot

dmnapolitano commented 1 month ago

No you're not it's ok! 😂 🎉 ❤️