Closed by zdavatz 4 years ago
Not really. '-' means we don't have data. 0 means we have data and it is 0. There is a significant difference. When it is '-', the actual number of deaths could be anything; we just don't know.
Ok, but then you are mixing string and integer in that column. So the column is not consistent. Would be great if we could stick to the CSV standard and not mix strings with integers in a column.
Yes, it is consistent. It is either a number or '-'. This is intentional. You can treat both as a string if you want. Or handle the two cases explicitly.
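Handling the two cases explicitly could look like the minimal sketch below. It assumes a whitespace-separated line laid out as canton, timestamp, confirmed cases, deaths, status (as in "AI 2020-03-26T18:00 11 - OK"); the names Entry, parse_count and parse_line are hypothetical and not part of the scraper.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    canton: str
    timestamp: str
    confirmed: Optional[int]
    deaths: Optional[int]
    status: str


def parse_count(field: str) -> Optional[int]:
    """Return None for '-' (no data), otherwise the integer value."""
    return None if field == "-" else int(field)


def parse_line(line: str) -> Entry:
    # Assumed layout: canton, timestamp, confirmed, deaths, status.
    canton, timestamp, confirmed, deaths, status = line.split()
    return Entry(canton, timestamp, parse_count(confirmed),
                 parse_count(deaths), status)


entry = parse_line("AI 2020-03-26T18:00 11 - OK")
# None means "unknown", which stays distinct from a reported 0.
print(entry.canton, entry.confirmed, entry.deaths)
```

Mapping '-' to None rather than 0 keeps the "no data" case visible to downstream code.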
Can you just leave the field empty then? That would be better for parsing than putting a '-'.
What conventions are you using for the output file?
No, I can't, because then it is actually harder to parse. It will also break add_db_entry.py.
Can we then at least stick to a standard CSV delimiter for the values?
@zdavatz How will CSV help? You will still need to handle an int or an empty field explicitly.
Yes, but we do not have to handle the string. Empty is easier to handle than a string ('-').
Parsing an empty field as int will still fail, so it requires the same amount of handling.
Could you show me a snippet of code showing how you do it now, and how you would do it if the field were empty?
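To illustrate the point about equal handling effort, here is a small Python sketch (the helper name to_int_or_none is made up for this example): int() rejects both an empty string and '-', so one explicit check is needed either way.

```python
from typing import Optional


def to_int_or_none(field: str) -> Optional[int]:
    """Treat both '' and '-' as missing data; anything else must be an int."""
    if field in ("", "-"):
        return None
    return int(field)


# Without the explicit check, both placeholders fail in the same way.
for raw in ("", "-"):
    try:
        int(raw)
    except ValueError:
        print(repr(raw), "cannot be parsed as int")

print(to_int_or_none(""), to_int_or_none("-"), to_int_or_none("0"))
```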
Oh. I see, you are trying to read it as csv. Well, yes, that will not really work. The output is not csv.
We should just make a secondary output that is more csv-like. Or better yet, use csv files that are in this repo.
Well, we just want to parse the up-to-date numbers as quickly as possible. Our map is updated every hour: http://covid19.ddrobotec.com/
I can start publishing csv files at https://www.functor.xyz/covid_19/scrapers/outputs/latest.csv which is the same data as latest.txt just in csv format.
If that works, I can have it working in an hour or two.
That would be great, sir! Will make our data analysts very happy ;).
@zdavatz Could you take a look at https://www.functor.xyz/covid_19/scrapers/outputs/latest.csv ? It provides the same info as latest.txt, without the failures.
Once we get https://github.com/openZH/covid_19/pull/275 merged, it will also provide extra information.
@zdavatz I got the data at https://www.functor.xyz/covid_19/scrapers/outputs/latest.csv working, and it also provides a few extras (hospitalized, ventilated, recovered/released) for some cantons (AG, VD, UR, ZG, soon ZH, TI, GR and JU).
$ curl 'https://www.functor.xyz/covid_19/scrapers/outputs/latest.csv'
date,time,abbreviation_canton_and_fl,ncumul_tested,ncumul_conf,ncumul_hosp,ncumul_ICU,ncumul_vent,ncumul_released,ncumul_deceased,source
2020-03-26,,SG,,306,,,,,,"Scraper for SG at 2020-03-28T00:31:38+01:00 using https://www.sg.ch/tools/informationen-coronavirus.html"
2020-03-26,,VD,,2532,,,,148,38,"Scraper for VD at 2020-03-28T00:31:59+01:00 using https://api.datawrapper.de/v3/charts/tr5bJ/data"
2020-03-27,16:00,AG,,364,,,9,,3,"Scraper for AG at 2020-03-28T00:30:26+01:00 using https://www.ag.ch/de/themen_1/coronavirus_2/alle_ereignisse/alle_ereignisse_1.jsp"
2020-03-27,18:00,AI,,12,,,,,,"Scraper for AI at 2020-03-28T00:30:29+01:00 using https://www.ai.ch/themen/gesundheit-alter-und-soziales/gesundheitsfoerderung-und-praevention/uebertragbare-krankheiten/coronavirus"
2020-03-27,13:00,AR,,44,,,,,2,"Scraper for AR at 2020-03-28T00:30:31+01:00 using https://www.ar.ch/verwaltung/departement-gesundheit-und-soziales/amt-fuer-gesundheit/informationsseite-coronavirus/"
2020-03-27,,BE,,718,,,,,8,"Scraper for BE at 2020-03-28T00:30:34+01:00 using https://www.besondere-lage.sites.be.ch/besondere-lage_sites/de/index/corona/index.html"
2020-03-27,,BL,,466,,,,,5,"Scraper for BL at 2020-03-28T00:30:35+01:00 using https://www.statistik.bl.ch/files/sites/Grafiken/COVID19/Grafik_COVID19_BL_Linie.htm"
2020-03-27,10:00,BS,,534,,,,,,"Scraper for BS at 2020-03-28T00:30:40+01:00 using https://www.gd.bs.ch/, https://www.gd.bs.ch//nm/2020-tagesbulletin-coronavirus-534-bestaetigte-faelle-im-kanton-basel-stadt-gd.html"
2020-03-27,,FR,,369,,,,,15,"Scraper for FR at 2020-03-28T00:30:44+01:00 using https://www.fr.ch/covid19/sante/covid-19/coronavirus-statistiques-evolution-de-la-situation-dans-le-canton"
2020-03-27,12:00,GE,,1924,,,,,23,"Scraper for GE at 2020-03-28T00:31:01+01:00 using https://www.ge.ch/document/point-coronavirus-maladie-covid-19/telecharger"
2020-03-27,13:30,GL,,44,,,,,,"Scraper for GL at 2020-03-28T00:31:08+01:00 using https://www.gl.ch/verwaltung/finanzen-und-gesundheit/gesundheit/coronavirus.html/4817"
2020-03-27,,GR,,409,,,,,9,"Scraper for GR at 2020-03-28T00:31:16+01:00 using https://www.gr.ch/DE/institutionen/verwaltung/djsg/ga/coronavirus/info/Seiten/Start.aspx"
2020-03-27,16:00,JU,,112,,,,,,"Scraper for JU at 2020-03-28T00:31:20+01:00 using https://www.jura.ch/fr/Autorites/Coronavirus/Accueil/Coronavirus-Informations-officielles-a-la-population-jurassienne.html"
2020-03-27,11:00,LU,,287,,,,,3,"Scraper for LU at 2020-03-28T00:31:23+01:00 using https://gesundheit.lu.ch/themen/Humanmedizin/Infektionskrankheiten/Coronavirus"
2020-03-27,14:00,NE,,287,,,,,5,"Scraper for NE at 2020-03-28T00:31:28+01:00 using https://www.ne.ch/autorites/DFS/SCSP/medecin-cantonal/maladies-vaccinations/Pages/Coronavirus.aspx"
2020-03-27,15:15,NW,,54,,,,,0,"Scraper for NW at 2020-03-28T00:31:31+01:00 using https://www.nw.ch/gesundheitsamtdienste/6044"
2020-03-27,,OW,,37,,,,,,"Scraper for OW at 2020-03-28T00:31:35+01:00 using https://www.ow.ch/de/verwaltung/dienstleistungen/?dienst_id=5962"
2020-03-27,07:30,SH,,36,,,,,,"Scraper for SH at 2020-03-28T00:31:43+01:00 using https://sh.ch/CMS/content.jsp?contentid=3209198&language=DE&_=1584807070095"
2020-03-27,00:00,SO,,157,,,,,1,"Scraper for SO at 2020-03-28T00:31:45+01:00 using https://corona.so.ch/"
2020-03-27,,SZ,,119,,,,32,1,"Scraper for SZ at 2020-03-28T00:31:48+01:00 using https://www.sz.ch/behoerden/information-medien/medienmitteilungen/coronavirus.html/72-416-412-1379-6948"
2020-03-27,,TG,,117,,,,,,"Scraper for TG at 2020-03-28T00:31:52+01:00 using https://www.tg.ch/news/fachdossier-coronavirus.html/10552"
2020-03-27,08:00,TI,,1688,,,,,76,"Scraper for TI at 2020-03-28T00:31:54+01:00 using https://www4.ti.ch/dss/dsp/covid19/home/"
2020-03-27,12:00,UR,,40,,,,3,0,"Scraper for UR at 2020-03-28T00:31:58+01:00 using https://www.ur.ch/themen/2962"
2020-03-27,,VS,,808,,,,,20,"Scraper for VS at 2020-03-28T00:32:06+01:00 using https://www.vs.ch/de/web/coronavirus"
2020-03-27,18:00,ZG,,101,,,,18,1,"Scraper for ZG at 2020-03-28T00:32:14+01:00 using https://www.zg.ch/behoerden/gesundheitsdirektion/amt-fuer-gesundheit/corona"
2020-03-27,09:30,ZH,,1578,,,,,11,"Scraper for ZH at 2020-03-28T00:32:17+01:00 using https://gd.zh.ch/internet/gesundheitsdirektion/de/themen/coronavirus.html"
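A file in this shape can be read with Python's standard csv module. The sketch below is only an illustration: it uses two rows copied from the output above instead of fetching the URL, and the names COUNT_COLUMNS, to_int_or_none and load_rows are hypothetical. Empty count fields become None, so "no data" stays distinct from a reported 0.

```python
import csv
import io
from typing import Optional

# Header and two rows copied from the latest.csv output shown above.
SAMPLE = '''date,time,abbreviation_canton_and_fl,ncumul_tested,ncumul_conf,ncumul_hosp,ncumul_ICU,ncumul_vent,ncumul_released,ncumul_deceased,source
2020-03-26,,SG,,306,,,,,,"Scraper for SG at 2020-03-28T00:31:38+01:00 using https://www.sg.ch/tools/informationen-coronavirus.html"
2020-03-27,12:00,UR,,40,,,,3,0,"Scraper for UR at 2020-03-28T00:31:58+01:00 using https://www.ur.ch/themen/2962"
'''

COUNT_COLUMNS = [
    "ncumul_tested", "ncumul_conf", "ncumul_hosp", "ncumul_ICU",
    "ncumul_vent", "ncumul_released", "ncumul_deceased",
]


def to_int_or_none(field: str) -> Optional[int]:
    """An empty field means no data; anything else is parsed as an int."""
    return int(field) if field else None


def load_rows(text: str) -> list:
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        for column in COUNT_COLUMNS:
            row[column] = to_int_or_none(row[column])
        rows.append(row)
    return rows


rows = load_rows(SAMPLE)
for row in rows:
    print(row["abbreviation_canton_and_fl"], row["ncumul_conf"], row["ncumul_deceased"])
```

The quoted source column is handled by the csv module itself, so commas inside it (as in the BS row) do not shift the other columns.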
great, thank you @baryluk 💪🏻👏🤣🥇🚨‼️
@baryluk what is the option to create the CSV file output, as per your link above?
If the Python scraper does not find a value for deaths, please output an int, not a string. Thank you @baryluk
AI 2020-03-26T18:00 11 - OK
better would be
AI 2020-03-26T18:00 11 0 OK