Open boogheta opened 4 years ago
Ah, this is because the number of confirmed cases for England is being updated with historic data from https://coronavirus.data.gov.uk/downloads/csv/coronavirus-cases_latest.csv, whereas there are no revised historic figures for the UK in that feed.
However, the England confirmed cases figures should be complete now, so there shouldn't be a reason to compute (UK - Wales - Scotland - NI).
(I wonder if we should set the UK confirmed cases total to be the sum of the four nations values.)
OK, this is what I guessed: I stopped using the subtraction, except for Tests, which are not included in the England totals files. Thanks for the quick reply. Feel free to close this issue, or leave it open if you want to change a few related things.
Hello Tom,
Many thanks for making the UK data on Covid-19 tests, confirmed cases and deaths publicly available.
Following up on the issue raised above, I have compared the data series (tests, confirmed cases and deaths) reported in the file covid-19-totals-uk with those obtained by aggregating the corresponding files across the four countries (Eng, Wal, Sct and Nir). I have found the following:
1) Over the whole time period, the number of tests is always higher in totals-uk. This is understandable since totals-eng does not report the number of tests.
2) The number of deaths is always higher when aggregating the four countries. This is probably because the series for the four countries have been updated but the ones for the UK totals have not (as you explained above).
3) What puzzles me is the following: the number of confirmed cases is higher when aggregating the four countries only up to 18 April 2020. From 19 April 2020, the number reported by totals-uk is higher. Why is this the case?
Many thanks for your help. Giulia
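For anyone wanting to repeat this comparison, here is a minimal sketch of the idea: subtract the sum of the four nations from the UK total on every date that all series share. The function name, the dictionary layout and the numbers are made up for illustration and are not taken from the repository's CSV files.

```python
def uk_minus_nations(uk_totals, nation_totals):
    """Return {date: UK total - sum of nation totals} for dates present in every series.

    uk_totals:     {date: value}
    nation_totals: {nation name: {date: value}}
    """
    common = set(uk_totals)
    for series in nation_totals.values():
        common &= set(series)
    return {
        date: uk_totals[date] - sum(series[date] for series in nation_totals.values())
        for date in sorted(common)
    }


# Toy figures only (NOT real data): the sign of the difference flips on the last date.
uk = {"2020-04-17": 100, "2020-04-18": 120, "2020-04-19": 150}
nations = {
    "England": {"2020-04-17": 80, "2020-04-18": 95, "2020-04-19": 100},
    "Wales": {"2020-04-17": 10, "2020-04-18": 12, "2020-04-19": 14},
    "Scotland": {"2020-04-17": 10, "2020-04-18": 12, "2020-04-19": 14},
    "Northern Ireland": {"2020-04-17": 5, "2020-04-18": 6, "2020-04-19": 7},
}

differences = uk_minus_nations(uk, nations)
for date, diff in differences.items():
    print(date, diff)
```

A negative value means the four nations add up to more than the UK total on that date, and a positive value means the UK total is higher.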
For 3: the UK confirmed cases figure is not being revised, as I mentioned above, so it doesn't equal the sum of the totals for the four nations. I don't know what happened on 18 April.
Thanks for your reply.
I looked into this more, and it looks like the UK totals (for confirmed cases) being higher than the sum of the four nations (from April 11 onwards) is due to the "pillar 2" tests, for which no location is reported. I wrote about this more here: http://tom-e-white.com/datavision/20-where-are-the-coronavirus-cases.html
Thanks very much.
I noticed that there is also a mismatch on 25th April between the total number of tests in the UK reported cumulatively (in the covid-19-totals-uk.csv file) and the "DailyPeopleTested" in the covid-19-tests-uk.csv file. More precisely, the cumulative total of tests increased by around 70000 on that day, while the value of "DailyPeopleTested" for the same day was around 20000. For the other days the two numbers mostly agree, although I also found some smaller mismatches on the 1st and 7th May.
I believe these increases in the cumulative total are due to revisions for late reporting, as other people mentioned above. I just wanted to report this in case somebody else notices it in the future.
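A check like the one described above can be sketched as follows: take the day-on-day increase of the cumulative series and compare it with the daily figure for the same date. The function name and the numbers are illustrative assumptions (the toy values are scaled down from the rough magnitudes mentioned above, not read from the files), and the sketch assumes ISO-formatted dates with no gaps between consecutive rows.

```python
def cumulative_vs_daily(cumulative, daily, tolerance=0):
    """List (date, cumulative increase, daily value) where the two disagree.

    cumulative: {ISO date: running total}
    daily:      {ISO date: count reported for that day}
    Assumes consecutive dates; a gap would make the increase span several days.
    """
    dates = sorted(cumulative)  # ISO dates sort chronologically as strings
    mismatches = []
    for previous, current in zip(dates, dates[1:]):
        if current not in daily:
            continue
        increase = cumulative[current] - cumulative[previous]
        if abs(increase - daily[current]) > tolerance:
            mismatches.append((current, increase, daily[current]))
    return mismatches


# Toy figures only (NOT real data).
cumulative_tests = {"2020-04-23": 1000, "2020-04-24": 1020, "2020-04-25": 1090}
daily_people_tested = {"2020-04-24": 20, "2020-04-25": 20}

mismatches = cumulative_vs_daily(cumulative_tests, daily_people_tested)
print(mismatches)
```

Raising `tolerance` hides small discrepancies so that only the large jumps are listed.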
Hi Tom,
I have just seen the great job you have done in creating the file covid-19-tests-uk.csv. Thanks a lot! Really appreciated.
Since I am working on the files you have created right now, I have found two discrepancies between the totals reported in covid-19-tests-uk.csv and covid-19-totals-uk.csv. I am pointing this out only because it may help someone else, not to be picky!
When comparing the series 'tests' in covid-19-totals-uk.csv with the series 'TotalPeopleTested' in covid-19-tests-uk.csv, all figures matched except those on April 12. In covid-19-totals-uk.csv, the figure is 95 lower.
When comparing the series 'confirmed' in covid-19-totals-uk.csv with the series 'TotalPositive' in covid-19-tests-uk.csv, all figures matched except those on April 10. In covid-19-totals-uk.csv, the figure is 3486 lower. This might explain the big spike in confirmed cases that we see on April 11 (pointed out before).
I assure you that we are doing interesting things with all the data you have created. Thanks again, Giulia
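The series-by-series comparison in the comment above could be reproduced with something like the sketch below, which reads two CSV texts and reports the dates on which the chosen columns differ. The column headers (`Date`, `Tests`, `TotalPeopleTested`) and the inline rows are assumptions for illustration; check the real files for the actual header names before using it.

```python
import csv
import io


def read_series(csv_text, date_col, value_col):
    """Parse CSV text into {date: int value}, skipping rows with an empty value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[date_col]: int(row[value_col]) for row in reader if row[value_col] != ""}


def differing_dates(a, b):
    """Return {date: a - b} for shared dates where the two series disagree."""
    return {d: a[d] - b[d] for d in sorted(a.keys() & b.keys()) if a[d] != b[d]}


# Toy CSV content only (NOT real data); headers are assumed, not verified.
totals_csv = "Date,Tests\n2020-04-11,100\n2020-04-12,205\n2020-04-13,400\n"
tests_csv = "Date,TotalPeopleTested\n2020-04-11,100\n2020-04-12,300\n2020-04-13,400\n"

totals = read_series(totals_csv, "Date", "Tests")
people_tested = read_series(tests_csv, "Date", "TotalPeopleTested")

diffs = differing_dates(totals, people_tested)
print(diffs)
```

To run it against the real files, replace the inline strings with the contents of each file, for example via `open(path, newline="").read()`.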
Hello,
I've been reusing your great work on UK data within my dashboard here: https://boogheta.github.io/coronavirus-countries/#country=UK
While adding Tests data now that you have completed it across all countries, I encountered something that looks to me like an error, but I might be reading things wrong: when looking at confirmed cases for the whole UK https://github.com/tomwhite/covid-19-uk-data/blob/master/data/covid-19-totals-uk.csv and for England alone https://github.com/tomwhite/covid-19-uk-data/blob/master/data/covid-19-totals-england.csv, the values for England alone appear to be greater than those for the whole UK until April 10th.
I noticed it because I'm completing England figures by computing UK - Wales - Scotland - Northern Ireland whenever an England figure is missing, since the others are all complete, but maybe I'm misunderstanding something?
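For reference, the fill-in-by-subtraction approach described here might look like the sketch below, with the fourth nation taken to be Northern Ireland. The function name and numbers are hypothetical. As the replies in this thread explain, the result is only as good as the UK total: if the UK series is not revised while the nations' series are, the derived England figure will be off.

```python
def fill_england_by_subtraction(england, uk, wales, scotland, northern_ireland):
    """Return a copy of `england` with missing dates filled as UK minus the other nations.

    All arguments are {date: value}. A date is filled only when the UK and all
    three other nations report a value for it; reported England values are kept.
    """
    filled = dict(england)
    for date in sorted(uk):
        if date in filled:
            continue
        others = (wales, scotland, northern_ireland)
        if all(date in series for series in others):
            filled[date] = uk[date] - sum(series[date] for series in others)
    return filled


# Toy figures only (NOT real data).
uk = {"2020-04-09": 100, "2020-04-10": 130}
wales = {"2020-04-09": 10, "2020-04-10": 12}
scotland = {"2020-04-09": 15, "2020-04-10": 18}
northern_ireland = {"2020-04-09": 5, "2020-04-10": 6}
england = {"2020-04-09": 70}  # 2020-04-10 is missing

england_filled = fill_england_by_subtraction(england, uk, wales, scotland, northern_ireland)
print(england_filled)
```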