epiforecasts / covid-rt-estimates

National and subnational estimates of the time-varying reproduction number for Covid-19
https://epiforecasts.io/covid/
MIT License

Old and new regions for cases present in estimates - UK #76

Closed seabbs closed 4 years ago

seabbs commented 4 years ago

@kathsherratt could you flag the old deprecated UK regions that we should remove from the repository? At the moment they are shown on the website and in all downstream CSVs.

@joeHickson I think the best solution to this is to delete the folders for these regions in order to stop them appearing in the summaries.

kathsherratt commented 4 years ago

Thanks for noticing this Sam, sorry I missed it.

For cases, deaths, and admissions, we need to remove:

For admissions only, we need to remove:

seabbs commented 4 years ago

@joeHickson could you clean these folders out from the production server? I think the Cornish internet is not ready for a 20 GB download and I don't have a VM with this loaded at the mo 😸

seabbs commented 4 years ago

This is getting fairly urgent. Can anyone look at this please?

I see multiple regions being updated for deaths in a way that I don't understand:

https://github.com/epiforecasts/covid-rt-estimates/tree/master/subnational/united-kingdom/deaths/national

@kathsherratt are there duplicate regions in the data source?

I have removed the UK for admissions.

joeHickson commented 4 years ago

Certainly the list at the point it's passed to EpiNow2 differs between the two most recent runs:

2020-10-14 01:01:11 INFO Producing estimates for: East of England, England, London, Midlands, North East and Yorkshire, North West, Northern Ireland, Scotland, South East, South West, Wales, United Kingdom
2020-10-15 00:59:27 INFO Producing estimates for: East Midlands, East of England, England, London, North East, North West, Northern Ireland, Scotland, South East, South West, Wales, West Midlands, Yorkshire and The Humber, United Kingdom
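
For reference, a minimal base R sketch that diffs the two lists. The region names are copied from the log lines above; the variable names (`run_14`, `run_15`, `dropped`, `added`) are only illustrative.

```r
# Region lists copied from the two log lines above
run_14 <- c("East of England", "England", "London", "Midlands",
            "North East and Yorkshire", "North West", "Northern Ireland",
            "Scotland", "South East", "South West", "Wales", "United Kingdom")
run_15 <- c("East Midlands", "East of England", "England", "London",
            "North East", "North West", "Northern Ireland", "Scotland",
            "South East", "South West", "Wales", "West Midlands",
            "Yorkshire and The Humber", "United Kingdom")

# Regions estimated on 14 Oct but not on 15 Oct
dropped <- setdiff(run_14, run_15)
# Regions estimated on 15 Oct but not on 14 Oct
added <- setdiff(run_15, run_14)

print(dropped)
print(added)
```

So the two combined NHS regions dropped out and four separate regions appeared in their place.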

We don't have logging of what locations were in the original data.
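
A rough sketch of what such logging could look like, assuming the raw data is a data frame with a `region` column. The helper `log_source_regions` is hypothetical and uses base R `message()` as a stand-in for whatever logger the pipeline actually uses.

```r
# Hypothetical helper: record which regions the raw data contains
# before anything is passed on for estimation
log_source_regions <- function(data, region_col = "region") {
  regions <- sort(unique(as.character(data[[region_col]])))
  line <- paste0(format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
                 " INFO Regions in source data: ",
                 paste(regions, collapse = ", "))
  message(line)
  # Return the regions invisibly so callers can reuse them
  invisible(regions)
}

# Toy input for illustration only
raw <- data.frame(region = c("Wales", "London", "Wales"))
found <- log_source_regions(raw)
```

Comparing this line with the existing "Producing estimates for" line would show whether regions change in the source or somewhere in between.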

joeHickson commented 4 years ago

Is it possible the data source has changed?

kathsherratt commented 4 years ago

The data source seems fine, returning the correct geographies (4 nations + 7 NHS regions) with data present for the correct variables (cases_new, deaths_new, and hosp_new_blend).

# Fetch UK data with NHS regions enabled, then list the regions returned
data <- covidregionaldata::get_regional_data("uk", nhsregions = TRUE)
unique(data$region)

But it does look like the 2 "new" regions that are only brought in by using the nhsregions = TRUE argument (Midlands; North East & Yorkshire) haven't updated recently, at least for cases (last updated 2 days ago).
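
A minimal sketch of how the last update per region could be checked. It runs on a toy data frame with invented values rather than the real download, and it assumes a `date` column alongside `region` and `cases_new`.

```r
# Toy stand-in for the downloaded data; all values are invented
toy <- data.frame(
  region = c("London", "London", "Midlands", "Midlands"),
  date = as.Date(c("2020-10-13", "2020-10-15", "2020-10-12", "2020-10-13")),
  cases_new = c(10, 12, 8, NA)
)

# Keep rows with an observed case count, then take the latest date per region
observed <- toy[!is.na(toy$cases_new), ]
last_update <- sapply(split(observed$date, observed$region),
                      function(d) format(max(d)))
print(last_update)
```

Regions whose latest date lags behind the others are the ones that have stopped updating.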

Has anything changed with the use of data_args in dataset-list.R? Could it be a change where the argument nhsregions = TRUE doesn't get passed through to covidregionaldata?
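
To illustrate the failure mode I have in mind, here is a sketch only. `fetch_regions` is a hypothetical stand-in for the real fetch function, its default of `nhsregions = FALSE` is an assumption, and the `dataset` list is not the actual structure in dataset-list.R.

```r
# Hypothetical stand-in for the real fetch function; the default of
# nhsregions = FALSE is an assumption made for illustration
fetch_regions <- function(country, nhsregions = FALSE) {
  if (nhsregions) "NHS regions" else "standard regions"
}

# Hypothetical dataset entry carrying extra arguments in data_args
dataset <- list(country = "uk", data_args = list(nhsregions = TRUE))

# Forwarding data_args: the extra argument reaches the fetch function
with_args <- do.call(fetch_regions, c(list(dataset$country), dataset$data_args))

# Dropping data_args: the call silently falls back to the default
without_args <- do.call(fetch_regions, list(dataset$country))

print(with_args)
print(without_args)
```

If `data_args` stopped being forwarded, nothing would error; we would just get the other set of regions, which matches what the logs show.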

joeHickson commented 4 years ago

Let's see if that resolves it.

joeHickson commented 4 years ago

looks like some more bits might have gone astray with the problematic merge of #73