Closed by benubah 2 years ago
Thanks @benubah, 1. looks good to me.
Re 2. But shouldn't `avg_` and `sum_` be identical if we summarize over one day only? I used `avg_` because I thought there may be a case where we have multiple entries per day (we shouldn't) and averaging seems more reasonable than summing up. If you can confirm the two are the same, I am fine with your change; I just want to understand your reasoning.
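To make the question concrete, here is a minimal sketch (not code from this repository) of how the sum and the average behave for a single day's values. The numbers and the helper `summarize_day` are hypothetical, chosen only for illustration.

```python
from statistics import mean


def summarize_day(values):
    """Return (sum, mean) of the values recorded for one day."""
    return sum(values), mean(values)


# Hypothetical data: one record for the day (the expected case) ...
single_entry = [120]
# ... and the same record entered twice (the case we want to guard against).
duplicated = [120, 120]

single_sum, single_avg = summarize_day(single_entry)
dup_sum, dup_avg = summarize_day(duplicated)

# With exactly one entry per day, sum and mean coincide.
print(single_sum == single_avg)
# With duplicate entries, the sum doubles while the mean is unchanged.
print(dup_sum, dup_avg)
```

So the two only agree when each day really has one entry; with duplicates the average is the more forgiving choice, which is the concern raised above.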
`avg_cap_` and `sum_cap_` are the same, while `avg_all_` and `sum_all_` are not the same. And at the end, we need `all_new_ = cap_new_ * pop`. This seems to work when we use `sum_` and drop `avg_`.
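A small sketch of one way to read this, under assumptions that are not confirmed by the thread: a group consists of several regions reported on the same day, `pop` is the group's total population, and the per-capita value is the group total divided by that population. The region names and figures are hypothetical.

```python
import math

# Hypothetical regions in one group on a single day: (new cases, population)
regions = {
    "A": (50, 1_000_000),
    "B": (30, 500_000),
    "C": (20, 500_000),
}

n_regions = len(regions)
group_pop = sum(pop for _, pop in regions.values())

# Group total and group average of the absolute counts.
sum_all_new = sum(new for new, _ in regions.values())
avg_all_new = sum_all_new / n_regions

# Assumed definition: per-capita value = group total / group population.
cap_new = sum_all_new / group_pop

# Multiplying back by the population should recover the absolute count.
recovered = cap_new * group_pop

print(math.isclose(recovered, sum_all_new))  # identity holds for the sum
print(math.isclose(recovered, avg_all_new))  # but not for the average
```

Under these assumptions, `cap_new * pop` reproduces the summed count and not the averaged one, which matches the observation that the identity works with `sum_`.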
To harmonize our data everywhere we need this PR:
- Reduce the size of `data_all.csv` by >1MB
- Drop `avg_` values and use `sum_` values instead: we need the sums instead of averages for groups in `data_all.csv` because we are summarizing for a single day.