epiforecasts / NCoVUtils

Utility functions for the 2019-NCoV outbreak
https://epiforecasts.io/NCoVUtils/

Data for additional countries. #72

Open seabbs opened 4 years ago

seabbs commented 4 years ago

I think it makes sense to expand to more datasets now.

There has been interest in the following:

Do you have any idea of sources @ffinger?

ffinger commented 4 years ago

I will try to find sources for those ASAP.

Here's an issue tracking requests for sub-national data in LMIC: https://github.com/reconhub/covid19hub/issues/5

With a dedicated spreadsheet: https://docs.google.com/spreadsheets/d/1uvg07BAmwKqLqhKvkejhkX7uvXiGCre4sz11Au3pz9Q/edit?usp=sharing

briatte commented 4 years ago

Are the countries listed above still of interest?

@ColinFay has coded a few things for Burkina Faso: https://github.com/reconhub/covid19hub/issues/5

… and the data does seem to include confirmed cases (not just contacts). It might take some proofreading, but perhaps this can be achieved (by hand, even, if needed).

Also, @ffinger's spreadsheet mentions Switzerland: openZH has done the job, but would you like a function that brings those data into NCoVUtils, like the ones the package already includes? What columns do you need beyond cases and deaths?

ffinger commented 4 years ago

Hi @briatte, for each country, we are looking for the following columns:

If available, the number of newly recovered, the number of tests done, and the numbers of positive and negative test results are also very useful.

Ideally regions should be named so that they match a reference geography (if available), for instance provided in rnaturalearth::ne_states() or in a separate reference file.
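As a rough illustration of the name-matching check described above, here is a minimal base R sketch. Everything in it is an assumption made for the example: the `reference_regions` vector stands in for names that might come from `rnaturalearth::ne_states()` or a reference file, the `cases` data frame is made-up data, and `unmatched_regions()` is a hypothetical helper, not an NCoVUtils function.

```r
# Hypothetical reference names; in practice these might be taken from
# rnaturalearth::ne_states() or from a separate reference file.
reference_regions <- c("Zurich", "Bern", "Geneva", "Ticino")

# Made-up country data with one region name that deviates from the reference.
cases <- data.frame(
  region = c("Zurich", "Bern", "Geneve", "Ticino"),
  cases  = c(10L, 5L, 7L, 3L),
  deaths = c(1L, 0L, 0L, 1L),
  stringsAsFactors = FALSE
)

# Return the distinct region names that have no exact match in the reference.
unmatched_regions <- function(regions, reference) {
  sort(unique(regions[!regions %in% reference]))
}

unmatched <- unmatched_regions(cases$region, reference_regions)
print(unmatched)
```

Any names returned this way would need to be renamed (or added to the reference) before the data can be joined to the geography. The check uses exact matching only; fuzzy matching is deliberately left out of this sketch.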

@seabbs are there any other columns that are required or that would ideally be provided?

ffinger commented 4 years ago

I added some additional countries and potential data sources in the spreadsheet. The data are all on HDX, but in different formats. It would be great if someone could give those a go:

Links to data sources here:

https://docs.google.com/spreadsheets/d/1uvg07BAmwKqLqhKvkejhkX7uvXiGCre4sz11Au3pz9Q/edit?usp=sharing

ffinger commented 4 years ago

Also consider these sources for European countries: https://github.com/ec-jrc/COVID-19/tree/master/data-by-region and https://data.humdata.org/dataset/europe-covid-19-subnational-cases

cwallaceh commented 4 years ago

Hi, I'm trying to add Mexico to the list. We are already extracting the data from official sources.

kathsherratt commented 4 years ago

Hi @cwallaceh - great, we'd be keen to add Mexico as well. Do you have a link to the data (and/or R code to extract and clean)? No problem if not, we are happy to do this.

I also wanted to flag that we are planning to fully replace NCoVUtils with a new package. This will have all the same functionality as NCoVUtils, and we think it will be much easier to use. It will include regional data for all the current countries and some new ones. However, it is not quite CRAN-ready yet, so we will advertise it more widely once it is.