Bisaloo closed 2 years ago
I added Switzerland, having found what appeared to be a fairly comprehensive and reputable data source, but I've not compared it with others and I don't think I'm qualified to judge between them.
If someone says that we should switch to another then I can look at replacing the back-end - and this could include studying @kathsherratt's dark magic on the UK code. (I think I ported the UK code to the new R6 system and I still find it baffling.)
It would probably be fairly straightforward to just use a different source for a subset of our data columns. I've found some new sources for Lithuania and had pondered adding hospitalization data.
Thanks, both, good to have a discussion on this.
So the first step might be to open an issue on the source repository we are using (https://github.com/openZH/covid_19/issues). It certainly looks maintained and high quality, so there must be solid reasons for the discrepancy. On the other hand, the source @bisaloo has found looks quite official and so seems like something we would want to source data from.
If we can establish some kind of superiority of either source (on the face of it the official source seems like the obvious choice) then switching would seem to make sense. If we can't, or if we think there are still good reasons to support both sources, then I see two options:
In the current class, add a new argument (similar to nhsregions in the UK class) and then add custom control code to download and join the data, as in the UK.
If the data fully overlaps but is equally valid we would ideally offer a choice of source with two separate documented classes. This gets a bit tricky with how we have set things up but seems in principle doable. I think the way to do this would be to have child classes for Switzerland and then have these be initialised and returned when the Switzerland class is called with that source (so Switzerland_source_name for each child class). That seems like a slightly more general solution for what is perhaps a fairly common problem but has a less than satisfying approach to dispatch.
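A minimal sketch of the child-class dispatch described in the second option, written in Python purely for illustration (the package itself uses R6 classes in R, so this is not its actual code). The child class names Switzerland_openzh and Switzerland_admin, the `source` argument, and the `source_name` keyword are all hypothetical.

```python
class Switzerland:
    """Parent class: calling it with a source name returns the matching child."""

    # Registry of child classes, keyed by source name (hypothetical design).
    _children = {}

    def __init_subclass__(cls, source_name, **kwargs):
        # Each child registers itself under the source it implements.
        super().__init_subclass__(**kwargs)
        cls.source_name = source_name
        Switzerland._children[source_name] = cls

    def __new__(cls, source="openzh"):
        target = cls
        if cls is Switzerland:
            # Dispatch only when the parent itself is called.
            if source not in Switzerland._children:
                known = sorted(Switzerland._children)
                raise ValueError(f"unknown source {source!r}; expected one of {known}")
            target = Switzerland._children[source]
        return object.__new__(target)

    def __init__(self, source="openzh"):
        # Placeholder for whatever state the real class would hold.
        self.data = None


class Switzerland_openzh(Switzerland, source_name="openzh"):
    """Child class for the openZH source, documented separately."""


class Switzerland_admin(Switzerland, source_name="admin"):
    """Child class for the covid19.admin.ch source, documented separately."""


# Calling the parent with a source returns an instance of that child class.
ch = Switzerland(source="admin")
print(type(ch).__name__, ch.source_name)
```

The registry keeps the parent unaware of individual sources, but the dispatch is still hidden inside the constructor, which is the less than satisfying part mentioned above.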
Anyone make any progress on this?
Looking at this, we may need a two-stage download method.
The data locations are updated daily and provided in a JSON file available at https://www.covid19.admin.ch/api/data/context
This gives the following:
"csv": {
"daily": {
"cases": "https://www.covid19.admin.ch/api/data/20211001-z0wrsmyu/sources/COVID19Cases_geoRegion.csv",
"casesVaccPersons": "https://www.covid19.admin.ch/api/data/20211001-z0wrsmyu/sources/COVID19Cases_vaccpersons.csv",
"hosp": "https://www.covid19.admin.ch/api/data/20211001-z0wrsmyu/sources/COVID19Hosp_geoRegion.csv",
"hospVaccPersons": "https://www.covid19.admin.ch/api/data/20211001-z0wrsmyu/sources/COVID19Hosp_vaccpersons.csv",
"death": "https://www.covid19.admin.ch/api/data/20211001-z0wrsmyu/sources/COVID19Death_geoRegion.csv",
"deathVaccPersons": "https://www.covid19.admin.ch/api/data/20211001-z0wrsmyu/sources/COVID19Death_vaccpersons.csv",
"test": "https://www.covid19.admin.ch/api/data/20211001-z0wrsmyu/sources/COVID19Test_geoRegion_all.csv",
"testPcrAntigen": "https://www.covid19.admin.ch/api/data/20211001-z0wrsmyu/sources/COVID19Test_geoRegion_PCR_Antigen.csv",
"hospCapacity": "https://www.covid19.admin.ch/api/data/20211001-z0wrsmyu/sources/COVID19HospCapacity_geoRegion.csv",
"re": "https://www.covid19.admin.ch/api/data/20211001-z0wrsmyu/sources/COVID19Re_geoRegion.csv",
"intCases": "https://www.covid19.admin.ch/api/data/20211001-z0wrsmyu/sources/COVID19IntCases.csv",
"virusVariantsWgs": "https://www.covid19.admin.ch/api/data/20211001-z0wrsmyu/sources/COVID19Variants_wgs.csv",
"covidCertificates": "https://www.covid19.admin.ch/api/data/20211001-z0wrsmyu/sources/COVID19Certificates.csv"
},
So we may need to have something which grabs the JSON, determines which CSVs it wants to download, then downloads them. I've looked at the UK code a bit and I'm currently not sure how to document two different data sources like this - what do we list as the source URL and text? Still, I can imagine how this may be done.
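The two-stage download could look roughly like the sketch below, in Python for illustration only (not the package's R code). It assumes the JSON contains a "csv" object with a "daily" entry, as in the excerpt above. Since the excerpt does not show where that object sits in the full file, the sketch searches for it rather than hard-coding a path. The function names and the injectable `fetch` argument are hypothetical.

```python
import csv
import io
import json
from urllib.request import urlopen

CONTEXT_URL = "https://www.covid19.admin.ch/api/data/context"


def fetch_text(url):
    """Default fetcher: download a URL and decode it as UTF-8 text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def find_daily_csv_urls(node):
    """Return the first "csv" -> "daily" mapping found anywhere in the parsed JSON."""
    if isinstance(node, dict):
        csv_block = node.get("csv")
        if isinstance(csv_block, dict) and isinstance(csv_block.get("daily"), dict):
            return csv_block["daily"]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_daily_csv_urls(child)
        if found is not None:
            return found
    return None


def download_sources(wanted, fetch=fetch_text, context_url=CONTEXT_URL):
    """Stage 1: read the context JSON. Stage 2: download the requested CSVs."""
    # Stage 1: the CSV locations change daily, so look them up first.
    daily = find_daily_csv_urls(json.loads(fetch(context_url)))
    if daily is None:
        raise ValueError("no csv/daily block found in the context JSON")
    missing = [name for name in wanted if name not in daily]
    if missing:
        raise KeyError(f"sources not listed in the context JSON: {missing}")

    # Stage 2: download each requested CSV and parse it into a list of row dicts.
    tables = {}
    for name in wanted:
        text = fetch(daily[name])
        tables[name] = list(csv.DictReader(io.StringIO(text)))
    return tables
```

Something like `download_sources(["cases", "hosp", "death"])` would then return one table per key. Passing a different `fetch` function lets the logic be tested without touching the network.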
This issue has been flagged as stale due to lack of activity
Related: https://github.com/epiforecasts/covid19-forecast-hub-europe/issues/906
We have been made aware of an alternative data source for Switzerland that gives completely different results from the current one (at least for hospitalisations): https://www.covid19.admin.ch/en/overview.
I'm keen to add this data source to covidregionaldata, but I'm not sure what the best way to handle this situation is, since there is already one data source and I don't know whether one is more reliable than the other.
There are several options: