sfbrigade / data-covid19-sfbayarea

Manual and automated processes of sourcing data for the stop-covid19-sfbayarea project
MIT License

HOTFIX: Contra Costa no longer has full history #217

Mr0grog closed 3 years ago

Mr0grog commented 3 years ago

The Contra Costa dashboard no longer provides data covering the complete history of cases, so the first record now has a cumulative case count that is higher than the number of cases reported on that day alone. Our reconciliation check compares the sum of daily cases against the cumulative count, and that mismatch is now causing the scraper to fail. This fixes the situation by no longer trying to reconcile the daily and cumulative cases on the first day of the dataset. (This also fixes an error we encountered while trying to report this failure: we checked the wrong property name for the date.)

This fixes the following error we are currently seeing in Slack:

Contra Costa county failed: 'day'
Traceback (most recent call last):
 File "/home/runner/work/stop-covid19-sfbayarea/stop-covid19-sfbayarea/scraper/scraper_data.py", line 30, in main
   out[county] = data_scrapers.scrapers[county].get_county()
 File "/home/runner/work/stop-covid19-sfbayarea/stop-covid19-sfbayarea/scraper/covid19_sfbayarea/data/contra_costa.py", line 112, in get_county
   'cases': get_timeseries_cases(api),
 File "/home/runner/work/stop-covid19-sfbayarea/stop-covid19-sfbayarea/scraper/covid19_sfbayarea/data/contra_costa.py", line 157, in get_timeseries_cases
   raise FormatError(f'Sum of daily cases != cumul_cases at record {index} (date: {record["day"]})')
KeyError: 'day'

Also see the GitHub Actions logs directly: https://github.com/sfbrigade/stop-covid19-sfbayarea/runs/2913050076?check_suite_focus=true
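For readers skimming this later, here is a rough sketch of the idea behind the fix, not the actual code in contra_costa.py. It assumes each record is a dict with hypothetical `date`, `cases`, and `cumul_cases` keys (the real property names may differ), and it defines a local stand-in for `FormatError`. The first record is treated as a baseline rather than reconciled, and the error message reads a key that actually exists on the record.

```python
from typing import Dict, List


class FormatError(Exception):
    """Local stand-in for the scraper's FormatError (assumption)."""


def check_timeseries(records: List[Dict]) -> None:
    """Check that daily cases add up to the reported cumulative cases.

    The first record is used only as a starting baseline, because the
    source may no longer include the history that precedes it.
    Key names ('date', 'cases', 'cumul_cases') are hypothetical.
    """
    if not records:
        return

    # Trust the first cumulative value instead of reconciling it.
    running_total = records[0]['cumul_cases']

    for index, record in enumerate(records[1:], start=1):
        running_total += record['cases']
        if running_total != record['cumul_cases']:
            raise FormatError(
                f'Sum of daily cases != cumul_cases at record {index} '
                f'(date: {record["date"]})'
            )


# Made-up sample data: the series starts mid-history, so the first
# cumulative count (5000) is much larger than that day's cases (12).
sample_records = [
    {'date': '2021-06-01', 'cases': 12, 'cumul_cases': 5000},
    {'date': '2021-06-02', 'cases': 8, 'cumul_cases': 5008},
    {'date': '2021-06-03', 'cases': 5, 'cumul_cases': 5013},
]
check_timeseries(sample_records)  # no exception: later days reconcile
```

Seeding the running total from the first cumulative value keeps the check useful for every later day while tolerating a truncated history.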

Mr0grog commented 3 years ago

Going ahead and merging this hotfix so that tonight’s data run succeeds.