signaturescience / focustools

Forecasting COVID-19 in the US
https://signaturescience.github.io/focustools/
GNU General Public License v3.0

function to retrieve vaccination data fixes #35 #41

Closed stephenturner closed 3 years ago

stephenturner commented 3 years ago

This adds a function to retrieve.R and does not modify any other code.

Vaccine data availability is still limited, so I'd like to revisit this later.

The get_vax() function pulls from Our World in Data, which is CC-BY, and summarizes to epiyear and epiweek. We currently have only a few weeks' worth of data, but it covers the US, states, and other locations (see ?get_vax). Attribution is given in the function documentation.
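For illustration, summarizing a daily series to epiyear and epiweek might look like the sketch below. This is not the code in this PR. The toy data frame and its values are made up, `lubridate::epiyear()` and `lubridate::epiweek()` are assumed as the week-labeling helpers, and the daily column is assumed to be a cumulative count, in which case the weekly value is the last observation of the week rather than a sum.

```r
library(dplyr)
library(lubridate)

# Hypothetical daily cumulative counts for one location (values are made up).
daily <- tibble(
  location = "US",
  date = seq(as.Date("2021-01-10"), as.Date("2021-01-23"), by = "day"),
  people_vaccinated = seq(1000, by = 100, length.out = 14)
)

weekly <- daily %>%
  # Label each day with its epidemiological year and week (weeks start on Sunday).
  mutate(epiyear = epiyear(date), epiweek = epiweek(date)) %>%
  group_by(location, epiyear, epiweek) %>%
  # For a cumulative series, keep the end-of-week value instead of summing.
  summarise(people_vaccinated = last(people_vaccinated, order_by = date),
            .groups = "drop")

weekly
```

If the daily column were incident counts instead, `sum()` would be the natural weekly summary; which one applies depends on how the upstream data is defined.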

Some stats on the US, TX (location 48), and VA (location 51):

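| location | epiyear | epiweek | total_distributed | total_vaccinations | proportion_vaccinations_given | people_vaccinated | people_fully_vaccinated | population | total_vaccinations_percap |
|:---------|--------:|--------:|------------------:|-------------------:|------------------------------:|------------------:|------------------------:|-----------:|--------------------------:|
| US       |    2021 |       2 |         118865525 |           43033771 |                          0.36 |          35700954 |                 3413012 |  328728466 |                      0.13 |
| US       |    2021 |       3 |         186415175 |           89425192 |                          0.48 |          76552939 |                12364322 |  328728466 |                      0.27 |
| US       |    2021 |       4 |         321990525 |          176432771 |                          0.55 |         146965706 |                28149837 |  328728466 |                      0.54 |
| US       |    2021 |       5 |          49933250 |           31123299 |                          0.62 |          25201143 |                 5657142 |  328728466 |                      0.09 |
| 48       |    2021 |       2 |           8144550 |            4093369 |                          0.50 |           3591019 |                  372942 |   28995881 |                      0.14 |
| 48       |    2021 |       3 |          13366775 |            7792936 |                          0.58 |           6730994 |                 1055253 |   28995881 |                      0.27 |
| 48       |    2021 |       4 |          23822000 |           14127105 |                          0.59 |          11721184 |                 2394578 |   28995881 |                      0.49 |
| 48       |    2021 |       5 |           3659550 |            2408879 |                          0.66 |           1928227 |                  478812 |   28995881 |                      0.08 |
| 51       |    2021 |       2 |           3348500 |             870513 |                          0.26 |            407650 |                   28164 |    8535519 |                      0.10 |
| 51       |    2021 |       3 |           4945500 |            1863097 |                          0.38 |           1644667 |                  208269 |    8535519 |                      0.22 |
| 51       |    2021 |       4 |           8081850 |            4337577 |                          0.54 |           3630202 |                  563405 |    8535519 |                      0.51 |
| 51       |    2021 |       5 |           1232350 |             833221 |                          0.68 |            687883 |                  118612 |    8535519 |                      0.10 |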

closes #35

vpnagraj commented 3 years ago

errrr maybe i'm missing something here ... but i'm not understanding this output. @stephenturner i need you to walk me through this before i merge.

are these numbers cumulative? or "incident" ??

take a look at 2021 epiweek 4 (row 3 above) ... "people_vaccinated" in the US is 146965706 ... even if that's cumulative ... that's about 45% of the US population!
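A quick check of that share, using only figures copied from the US rows of the table above. The variable names are just for this sketch.

```r
# Figures copied from the US, 2021 epiweek 4 row of the table in this thread.
people_vaccinated <- 146965706
population <- 328728466

# Share of the population, in percent.
share_pct <- 100 * people_vaccinated / population
round(share_pct, 1)

# US people_vaccinated for epiweeks 2 to 5, from the same table.
us_people_vaccinated <- c(35700954, 76552939, 146965706, 25201143)

# A cumulative series never decreases from one week to the next.
has_decrease <- any(diff(us_people_vaccinated) < 0)
has_decrease
```

The value drops between epiweek 4 and epiweek 5, so the weekly column cannot be a plain running total, whatever else it turns out to be.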

like i said i might be misinterpreting the columns. but if so that's an issue we should resolve too. maybe by simplifying the output of get_vax() to look more like get_cases() / get_deaths():

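library(dplyr)
library(focustools)  # provides get_vax()

vax <- get_vax()
vax %>%
  # split people_vaccinated into partially and fully vaccinated
  mutate(partial = people_vaccinated - people_fully_vaccinated) %>%
  rename(full = people_fully_vaccinated) %>%
  # treat the weekly values as incident counts ...
  select(location, epiyear, epiweek, ipartial = partial, ifull = full) %>%
  group_by(location) %>%
  # ... and accumulate them within each location
  mutate(cpartial = cumsum(ipartial),
         cfull = cumsum(ifull)) %>%
  head(10) %>%
  yawp::more() %>%
  knitr::kable()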
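
| location | epiyear | epiweek |  ipartial |    ifull |  cpartial |    cfull |
|:---------|--------:|--------:|----------:|---------:|----------:|---------:|
| US       |    2021 |       2 |  32287942 |  3413012 |  32287942 |  3413012 |
| US       |    2021 |       3 |  64188617 | 12364322 |  96476559 | 15777334 |
| US       |    2021 |       4 | 118815869 | 28149837 | 215292428 | 43927171 |
| US       |    2021 |       5 |  19544001 |  5657142 | 234836429 | 49584313 |
| 01       |    2021 |       2 |    283086 |    30003 |    283086 |    30003 |
| 01       |    2021 |       3 |    640224 |   107909 |    923310 |   137912 |
| 01       |    2021 |       4 |   1417602 |   259488 |   2340912 |   397400 |
| 01       |    2021 |       5 |    242970 |    55331 |   2583882 |   452731 |
| 02       |    2021 |       2 |    101109 |    20972 |    101109 |    20972 |
| 02       |    2021 |       3 |    248709 |    63361 |    349818 |    84333 |
| .        |       . |       . |         . |        . |         . |        . |
| .        |       . |       . |         . |        . |         . |        . |
| .        |       . |       . |         . |        . |         . |        . |
| .        |       . |       . |         . |        . |         . |        . |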

^ those numbers are off. but might be worth using that format? idk.
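One way that format could be filled in, sketched under an assumption the thread does not confirm: that the weekly values are end-of-week cumulative totals. In that case the incident column comes from differencing, not from `cumsum()`. The data frame `vax_cum` and its values are hypothetical; the `ifull` / `cfull` names follow the proposal above.

```r
library(dplyr)

# Hypothetical end-of-week cumulative totals (values are made up).
vax_cum <- tibble(
  location = rep(c("US", "51"), each = 3),
  epiyear = 2021,
  epiweek = rep(2:4, times = 2),
  cfull = c(100, 250, 450, 10, 30, 60)
)

vax_long <- vax_cum %>%
  group_by(location) %>%
  arrange(epiyear, epiweek, .by_group = TRUE) %>%
  # Incident = this week's cumulative total minus the previous week's.
  # The first week has no predecessor here, so its incident equals its total,
  # which is only right if the series really starts from zero.
  mutate(ifull = cfull - lag(cfull, default = 0)) %>%
  ungroup() %>%
  select(location, epiyear, epiweek, ifull, cfull)

vax_long
```

With this construction `cumsum(ifull)` within a location reproduces `cfull`, which is a cheap consistency check on the output.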