annapowellsmith closed 4 years ago
Thanks for reaching out.
I'm afraid this issue falls outside of the dev team's domain. I will share it with our data team, but I suggest you use the email on the website to get in touch too.
FYI - @statsgeekclare @PHEgeorginaanderson
Thanks. I sent a link to this GitHub issue to coronavirus-tracker@phe.gov.uk as suggested.
This is the reply I got by email:
Please see also ‘About the data’. In the chart ‘by nation’, the additional pillar 2 data that could retrospectively not be assigned to an exact date has been entered at the mid-point of the related period (15th June=14th July). At the same time a deduplication (to count people once even if there are multiple tests) taking effect on 2nd July reduced the number of cases and shows as this decrease in the chart Cases by date reported. I appreciate that this may in places not present itself in a completely consistent way but I’m sure you’ll understand that the co-ordination of four nations’ data where definitions and methods are still evolving and not always perfectly matching is a challenge that required some adjustments in the process.
I'm not sure I fully understand this, but sharing anyway in case it helps anyone.
What I expected to see
I expected the "Cases by date reported" chart of cases across the UK to show the same total numbers as the "Cases by date reported, by nation" chart, since the second is presumably just a more granular version of the first.
If the two differed, I expected a clear explanation as to why, and the process used to record each set of numbers.
What I actually saw
The UK totals do not match the sum of cases by nation. For example, on 6 May 2020, the new cases reported for the UK were 6,111, but the sum of new cases reported by each nation was 1,611 + 53 + 272 + 95 = 2,031.
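To make the size of the gap explicit, here is a minimal sketch that reproduces the arithmetic using only the figures quoted above for 6 May 2020. The variable names are illustrative, and the per-nation values are listed in the order quoted, without assuming which nation each one belongs to.

```python
# Figures quoted in this issue for 6 May 2020 (by date reported).
# Variable names are illustrative, not taken from the dashboard.
uk_reported = 6111
nation_reported = [1611, 53, 272, 95]  # per-nation values, in the order quoted

nation_sum = sum(nation_reported)
gap = uk_reported - nation_sum

print(f"Sum of nations: {nation_sum}")
print(f"UK total:       {uk_reported}")
print(f"Difference:     {gap}")
```

On these figures the UK total exceeds the sum of the nations by 4,080 cases, which is about twice the by-nation sum itself.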
Also, the chart of cumulative "Cases by date reported" for the UK looks like this:
But the chart of cumulative "Cases by date reported, by nation" for each devolved nation looks like this:
Clearly, there is some major difference in recording behaviour.
I tried to understand this by looking at the chart notes and documentation. These did not explain the difference.
The UK-wide "Cases by date reported" chart has an accompanying note:
And the more granular "Cases by date reported, by nation" chart has an accompanying note:
Firstly, it is unclear whether the first note also applies to the second chart, and vice versa.
Secondly, it is still unclear why the numbers in the two charts do not match. The about page does not clarify the issue.
Requested solution
Please explain clearly exactly which positive test results have been recorded on each chart, at each date, and why the numbers are different.
This will be most helpfully done by documenting the qualifying criteria used to create each set of numbers, at each date, in each case.