klenwell / covid-19

Python command-line application to collect and analyze COVID-19 data.

CDPH stopped updating vaccine data on 4/19. #59

Closed klenwell closed 2 years ago

klenwell commented 2 years ago

I noticed this morning that the vaccine data has not been updated since 4/19. See the 0 values at the top of the right-most columns here:

klenwell commented 2 years ago

The original CDPH data source now returns 404:

However, the CDPH dashboard page notes some recent reporting changes:

On May 18, 2021, the denominator for calculating vaccine coverage has been changed from age 16+ to age 12+ to reflect new vaccine eligibility criteria. The previous dataset based on age 16+ denominators has been uploaded as an archived table.

Starting on May 29, 2021 the methodology for calculating on-hand inventory in the shipped/delivered/on-hand dataset has changed. Please see the accompanying data dictionary for details. In addition, this dataset is now down to the ZIP code level.

And it looks like this may be the new URL for the vaccine data:

klenwell commented 2 years ago

I've added a __main__ block so the extract can be run on its own, which makes testing easier:

$ python covid_app/extracts/cdph/oc_vaccines_daily_extract.py 
--Return--
> /home/klenwell/projects/covid-19/covid_app/extracts/cdph/oc_vaccines_daily_extract.py(124)<module>()->None
-> breakpoint()
(Pdb) extract
<__main__.OcVaccinesDailyExtract object at 0x7f2675938190>
(Pdb) extract.ends_on
datetime.date(2022, 4, 19)
(Pdb) extract.url
'https://data.ca.gov/api/3/action/datastore_search_sql?sql=SELECT%20*%20from%20%22c020ef6b-2116-4775-b11d-9df2875096ab%22%20WHERE%20%22county%22%20LIKE%20%27Orange%27'

It looks like I will be able to replace the old resource ID, c020ef6b-2116-4775-b11d-9df2875096ab, in the URL query string with the new ID, 92172873-f424-4635-ad38-71ea1c9ffcc4.
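As a rough sketch of what that swap amounts to, the query URL can be rebuilt from a resource ID. The endpoint and both IDs are copied from this thread. The names `build_extract_url` and `SQL_TEMPLATE` are hypothetical, made up for illustration, and may not match how the extract class actually builds its URL.

```python
from urllib.parse import quote

API_URL = "https://data.ca.gov/api/3/action/datastore_search_sql"

# Resource IDs quoted in this thread.
OLD_RESOURCE_ID = "c020ef6b-2116-4775-b11d-9df2875096ab"
NEW_RESOURCE_ID = "92172873-f424-4635-ad38-71ea1c9ffcc4"

# Hypothetical template; mirrors the SQL visible in the decoded extract.url.
SQL_TEMPLATE = "SELECT * from \"{resource_id}\" WHERE \"county\" LIKE '{county}'"


def build_extract_url(resource_id, county="Orange"):
    """Build the datastore SQL query URL for one county (hypothetical helper)."""
    sql = SQL_TEMPLATE.format(resource_id=resource_id, county=county)
    # Keep '*' unescaped so the result matches the URL printed in the pdb session.
    return f"{API_URL}?sql={quote(sql, safe='*')}"


print(build_extract_url(NEW_RESOURCE_ID))
```

Calling `build_extract_url(OLD_RESOURCE_ID)` reproduces the URL from the pdb session, so only the ID changes between the old and new requests.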

klenwell commented 2 years ago

Updating the report ID seems to have done the trick. Notice we now get data through May 3:

$ python covid_app/extracts/cdph/oc_vaccines_daily_extract.py 
--Return--
> /home/klenwell/projects/covid-19/covid_app/extracts/cdph/oc_vaccines_daily_extract.py(125)<module>()->None
-> breakpoint()
(Pdb) extract.ends_on
datetime.date(2022, 5, 3)
(Pdb) extract.dated_records.get(extract.ends_on)
{'california_flag': 'California', 'cumulative_pfizer_doses': '3403224', 'cumulative_total_doses': '6096868', 'cumulative_fully_vaccinated': '2305711', 'pfizer_doses': '1656', 'booster_recip_count': '621', 'moderna_doses': '1452', '_full_text': "'-03':4 '-05':3 '12':11 '1264492':21 '1452':9 '162561':12 '1656':7 '200942':14 '2022':2 '2305711':16 '2336355':10 '2506653':18 '258':13 '266':17 '273':15 '3245':5 '3403224':8 '6096868':6 '621':20 'california':19 'orange':1", 'at_least_one_dose': '266', 'county': 'Orange', 'partially_vaccinated': '258', 'total_doses': '3245', 'fully_vaccinated': '273', 'cumulative_booster_recip_count': '1264492', 'jj_doses': '12', 'cumulative_jj_doses': '162561', 'cumulative_at_least_one_dose': '2506653', '_id': 25277, 'administered_date': '2022-05-03', 'total_partially_vaccinated': '200942', 'cumulative_moderna_doses': '2336355'}
(Pdb) extract.boosted[extract.ends_on]
621
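The raw record holds its counts as strings ('621'), while extract.boosted returns an integer. The thread does not show how the extract does that conversion, so the following is only a sketch of one way to key records by date and cast a field. The helpers `index_by_date` and `count_for` are hypothetical, and the record is an abridged copy of the one printed above.

```python
from datetime import date

# Abridged copy of the record printed in the pdb session.
record = {
    "administered_date": "2022-05-03",
    "county": "Orange",
    "total_doses": "3245",
    "fully_vaccinated": "273",
    "booster_recip_count": "621",
    "cumulative_booster_recip_count": "1264492",
}


def index_by_date(records):
    """Key raw API records by their administered_date (hypothetical helper)."""
    return {date.fromisoformat(r["administered_date"]): r for r in records}


def count_for(dated_records, day, field):
    """Return an integer count for one day, or 0 when the day has no record."""
    rec = dated_records.get(day)
    return int(rec[field]) if rec else 0


dated_records = index_by_date([record])
ends_on = max(dated_records)  # latest date present in the data

print(ends_on)  # 2022-05-03
print(count_for(dated_records, ends_on, "booster_recip_count"))  # 621
```

Returning 0 for a missing day is a design assumption here. It would produce the kind of 0 values that first flagged the stale data, so a stricter version might raise an error instead.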

klenwell commented 2 years ago

Resolved

Fixed by this PR: https://github.com/klenwell/covid-19/pull/60

Just needed to update the extract ID, which references the data source on the CDPH website.