openZH / covid_19

COVID19 case numbers of Cantons of Switzerland and Principality of Liechtenstein (FL). The data is updated at best once a day (times of collection and update may vary). Start with the README.
https://www.zh.ch/de/gesundheit/coronavirus/zahlen-fakten-covid-19.zhweb-noredirect.zhweb-cache.html?keywords=covid19&keyword=covid19#/
Creative Commons Attribution 4.0 International

Error in the TI data #460

Closed cenadia closed 4 years ago

cenadia commented 4 years ago

The total number of positive cases for Ticino on 04.04.20 is wrong. The correct figure is 2’442 positive cases, not 2’422. The number of positives reported for 05.04.20 is correct.

(see: https://www4.ti.ch/area-media/comunicati/dettaglio-comunicato/?NEWS_ID=187575&tx_tichareamedia_comunicazioni%5Baction%5D=show&tx_tichareamedia_comunicazioni%5Bcontroller%5D=Comunicazioni&cHash=4b3f64139a9dd36b9fd30a1ace23654a)

andreasamsler commented 4 years ago

Checking, thanks

baryluk commented 4 years ago

Looks like a human error during a review in https://github.com/openZH/covid_19/commit/aae4e7ee0555883b24a99798ab88cca53c68e0ae by @viktoria023 ?

The TI scraper did scrape 2442:

...
TI 2020-04-04T08:00    2442     165 OK 2020-04-04T11:05:45+02:00 # Extras: ncumul_released=287,ncumul_hosp=370,ncumul_ICU=75,ncumul_vent=67 # URLs: https://www4.ti.ch/dss/dsp/covid19/home/, https://www4.ti.ch/area-media/comunicati/, https://www4.ti.ch/area-media/comunicati/dettaglio-comunicato/?NEWS_ID=187570&tx_tic
...

ncumul_ICU also differs: the scraper reported 75, but the CSV has 72. The value was 72 on 2020-04-05, though.

Worth double-checking against the bulletins.
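A check like this could be automated. Below is a minimal Python sketch, not the repository's actual tooling: `parse_scraper_line` and `diff_against_csv` are hypothetical helpers, the field names `ncumul_conf` and `ncumul_deceased` for the third and fourth columns are assumptions about the scraper output format, and the CSV values are the ones quoted in this thread.

```python
def parse_scraper_line(line):
    """Parse one scraper output line into a dict (assumed column layout)."""
    # Everything after " # " is a comment section such as "Extras: ..." or "URLs: ...".
    main, *comments = line.split(" # ")
    parts = main.split()
    record = {
        "abbr": parts[0],
        "date": parts[1],
        "ncumul_conf": int(parts[2]),      # assumed: cumulative confirmed cases
        "ncumul_deceased": int(parts[3]),  # assumed: cumulative deaths
    }
    for comment in comments:
        if comment.startswith("Extras: "):
            for pair in comment[len("Extras: "):].split(","):
                key, value = pair.split("=")
                record[key.strip()] = int(value)
    return record


def diff_against_csv(scraped, csv_row):
    """Return {field: (scraped_value, csv_value)} for every field that differs."""
    return {
        key: (scraped[key], csv_value)
        for key, csv_value in csv_row.items()
        if key in scraped and scraped[key] != csv_value
    }


if __name__ == "__main__":
    # Scraper line from this thread, with the URLs section left out.
    line = (
        "TI 2020-04-04T08:00    2442     165 OK 2020-04-04T11:05:45+02:00 "
        "# Extras: ncumul_released=287,ncumul_hosp=370,ncumul_ICU=75,ncumul_vent=67"
    )
    csv_row = {"ncumul_conf": 2422, "ncumul_ICU": 72}  # values reported in the CSV
    print(diff_against_csv(parse_scraper_line(line), csv_row))
```

Any non-empty result flags a row where the reviewed CSV and the scraper disagree, which is a cue to go back to the official bulletin.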

It is also odd that https://www4.ti.ch/area-media/comunicati/dettaglio-comunicato/?NEWS_ID=187570&tx_tic says 2020-04-03, while the scraper says 2020-04-04.

I think I know what the problem is. The scraper scrapes two pages. One is updated earlier than the other, so the scraped date is the newer one while the data taken from the second page is still from the day before. I will look into how to prevent this. Filed a bug for this: https://github.com/openZH/covid_19/issues/515
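One possible guard, sketched in Python under assumptions: `combine_pages` is a hypothetical helper, not the fix tracked in the linked issue, and which page carries which field is made up for illustration. The idea is simply to refuse to merge values when the two pages report different dates.

```python
def combine_pages(overview, bulletin):
    """Merge values scraped from two pages, refusing mismatched dates.

    Both arguments are dicts with a "date" key (ISO date string) plus
    numeric fields. The page and field names here are illustrative only.
    """
    if overview["date"] != bulletin["date"]:
        raise ValueError(
            f"date mismatch: overview says {overview['date']}, "
            f"bulletin says {bulletin['date']}"
        )
    merged = dict(bulletin)
    merged.update(overview)
    return merged


if __name__ == "__main__":
    overview = {"date": "2020-04-04", "ncumul_conf": 2442}
    stale_bulletin = {"date": "2020-04-03", "ncumul_ICU": 75}
    try:
        combine_pages(overview, stale_bulletin)
    except ValueError as err:
        print("skipping update:", err)
```

Failing loudly here means a stale page produces no row at all instead of a row that mixes two days.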

simgraworldwide commented 4 years ago

Corrected the row. We're in contact with TI to raise awareness. Hopefully we'll see more easily readable publications soon.