epiforecasts / covidregionaldata

An interface to subnational and national level COVID-19 data. For all countries supported, this includes a daily time-series of cases. Wherever available we also provide data on deaths, hospitalisations, and tests. National level data is also supported using a range of data sources as well as linelist data and links to intervention data sets.
https://epiforecasts.io/covidregionaldata/

rewrite get_regional_data so it uses Google when there is no country-specific class #406 #435

Open RichardMN opened 2 years ago

RichardMN commented 2 years ago

This is an attempt to write try_regional_data, intended as a drop-in replacement for get_regional_data: it takes the same arguments and returns output in the same form. The additional feature is that, when there is no country-specific class, it checks whether Google has data for the country and, if so, uses the Google class to download and present that data.

try_regional_data will always use the country-specific data if it exists.

It does not [yet] amend the Google class output to make it look more like one of the country-specific classes' output.

It looks into the Google class to peek at the list of data available from Google without instantiating the class unnecessarily.
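
The lookup order described above could be sketched roughly as follows. This is a minimal base-R illustration, not the code in this PR: country_classes, google_countries and choose_source are hypothetical names, and the two country lists are made-up stand-ins for whatever the package and the Google class actually report.

```r
# Hypothetical stand-ins, not covidregionaldata internals.
# Countries assumed to have a dedicated country-specific class:
country_classes <- c("uk", "italy", "germany")
# Countries the Google data set is assumed to cover:
google_countries <- c("uk", "spain", "india")

# Decide which source to use by inspecting plain character vectors,
# so no class has to be instantiated just to answer the question.
choose_source <- function(country) {
  key <- tolower(country)
  if (key %in% country_classes) {
    # Country-specific data always takes priority when it exists.
    return("country_class")
  }
  if (key %in% google_countries) {
    # No dedicated class: fall back to Google if it lists the country.
    return("google")
  }
  stop("No data source available for ", country, call. = FALSE)
}
```

With these illustrative lists, choose_source("UK") picks the country-specific class, choose_source("Spain") falls back to Google, and choose_source("Furgleburf") stops with an error.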

RichardMN commented 2 years ago

Do you think it makes sense to have this support in a separate function or should we instead just port everything into get_regional_data? Seems like pretty nice functionality that people will want and so I imagine everyone will just switch to try_... otherwise.

So I've done a bit more thinking on this and I would propose:

I think I can start applying this approach later this week. With help it may be ready for 0.9.3. It feels like a big potential change to one of the main published interfaces, but it should also cover for data sources which no longer work and for which we don't see another country-specific source available (e.g. India #430), and extend availability to many more countries (e.g. Spain, but not Furgleburf).

seabbs commented 2 years ago

So I've done a bit more thinking on this and I would propose:

  • much of this gets ported into get_regional_data, so that it uses a class if one is available and then goes looking for alternatives
  • the Google[/JHU] portion of the code should work to make its output more consistent with the country-specific classes: currently it adjusts the level number (adding one) but it should probably also clear out some columns (country name) which don't make sense when being used for a specific country
  • there should be a parameter (with a sensible default) to specify whether to try non-Class results, or perhaps to prefer non-Class results. I'm not sure whether this is try_elsewhere = "first" / "second" / "never" or just a TRUE / FALSE flag, but whatever it is, the behaviour for countries where we have country-specific Classes shouldn't require a new parameter in calling code to get the same output.
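
One way the last two points might look in practice is sketched below in base R. Everything here is an assumption for illustration: source_order and tidy_generic_output are hypothetical helpers, try_elsewhere is only the parameter name floated above, and the "country" column name is a guess at what a generic source might return.

```r
# Hypothetical sketch; none of these names are part of the package API.

# Return the sources to try, in order. The default ("second") tries the
# country-specific class first, so existing calls for countries that have
# a class behave the same without passing any new argument.
source_order <- function(has_class,
                         try_elsewhere = c("second", "first", "never")) {
  try_elsewhere <- match.arg(try_elsewhere)
  specific <- if (has_class) "country_class" else character(0)
  generic <- "google"
  switch(try_elsewhere,
    second = c(specific, generic), # class first, then the generic source
    first = c(generic, specific),  # prefer the generic source
    never = specific               # never leave the country-specific class
  )
}

# Drop columns that are redundant once a single country has been selected
# ("country" is an assumed column name).
tidy_generic_output <- function(data, drop_cols = "country") {
  data[, setdiff(names(data), drop_cols), drop = FALSE]
}
```

A three-valued argument keeps the "prefer non-Class results" case expressible, which a plain TRUE / FALSE flag would not; an empty result from source_order means there is nothing to try.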

These are all great ideas. I'd suggest we push this to a later update so we can fix the package on CRAN.

github-actions[bot] commented 2 years ago

This PR has been flagged as stale due to lack of activity