Closed corneliusroemer closed 2 years ago
Hi @corneliusroemer , thank you for finding this issue and providing a detailed report. I can confirm that the G.h cases derived from the 2022-07-19, 2022-07-15, and 2022-06-24 Epidemiological Overviews had a Date_confirmation of the publication date instead of the cutoff described by the UKHSA. I have resolved this issue.
As for the Date_confirmation of 2022-06-26, this was the cutoff date described by the UKHSA in the following report: https://www.gov.uk/government/publications/monkeypox-outbreak-epidemiological-overview/monkeypox-outbreak-epidemiological-overview-28-june-2022.
Thanks!
Excellent, thanks @tvarrelman!
There may be similar issues in other countries' data - it would be great to check this systematically. You probably have a list of the websites you check for each country.
England has released new case data twice weekly for a while now - always on Tuesday and Friday.
As their publishing schedule is quite regular, I was surprised to find the 7d average case rate plot for the UK to be so spiky:
When I investigated the date of confirmation for the English cases in your Google Sheet, I noticed what causes the spikes: you seem to be inconsistent about the weekdays to which these English cases are attributed.
Could you double-check that you choose consistent weekdays for the date of confirmation? That way the spikes in the 7d average plots would disappear.
Thanks a lot!
Here's the pivot table I used to spot the issue:
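For anyone who wants to reproduce this kind of check without a spreadsheet, here is a minimal sketch of a weekday tally. It is not the pivot table from the sheet. The rows are made-up sample data, and the `Country` field and the `weekday_counts` helper are assumptions for illustration; only `Date_confirmation` is a column name taken from this thread.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical sample rows: (Country, Date_confirmation in ISO format).
rows = [
    ("England", "2022-06-26"),
    ("England", "2022-06-26"),
    ("England", "2022-06-24"),
    ("England", "2022-07-15"),
    ("England", "2022-07-19"),
    ("Germany", "2022-06-24"),
]

def weekday_counts(rows, country):
    """Count the cases of one country by weekday of Date_confirmation."""
    counts = Counter()
    for row_country, iso_date in rows:
        if row_country == country:
            # date.weekday() returns 0 for Monday through 6 for Sunday.
            counts[WEEKDAYS[date.fromisoformat(iso_date).weekday()]] += 1
    return counts

for day, n in weekday_counts(rows, "England").items():
    print(day, n)
```

If the attribution were consistent, the counts would concentrate on the same one or two weekdays; a spread across both cutoff weekdays and report weekdays points to mixed conventions.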
These are the official update dates; you should probably align with them:
See https://www.gov.uk/government/news/monkeypox-cases-confirmed-in-england-latest-updates#full-publication-update-history
You mostly seem to have chosen the cutoff date used by UKHSA - but sometimes you used the report day. It should be easy to fix - just filter to English cases and do some bulk editing. Thanks!
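The bulk edit could look roughly like the sketch below. It assumes each case is a dict with `Country` and `Date_confirmation` keys (the `Country` key and the `align_to_cutoff` helper are hypothetical). The only mapping entry taken from this thread is the 28 June 2022 report with its 2022-06-26 cutoff; the other report dates would need to be filled in from the UKHSA reports.

```python
# Report (publication) date -> cutoff date stated in that report.
# Only this pair comes from the thread; add the others from the UKHSA reports.
REPORT_TO_CUTOFF = {
    "2022-06-28": "2022-06-26",
}

def align_to_cutoff(cases, country, report_to_cutoff):
    """Return new case dicts with report dates replaced by cutoff dates.

    Only cases from `country` are changed; dates without a mapping entry
    are left as they are.
    """
    fixed = []
    for case in cases:
        if case["Country"] == country:
            old = case["Date_confirmation"]
            case = {**case, "Date_confirmation": report_to_cutoff.get(old, old)}
        fixed.append(case)
    return fixed

# Hypothetical sample cases.
cases = [
    {"Country": "England", "Date_confirmation": "2022-06-28"},
    {"Country": "England", "Date_confirmation": "2022-06-26"},
    {"Country": "Germany", "Date_confirmation": "2022-06-28"},
]

for case in align_to_cutoff(cases, "England", REPORT_TO_CUTOFF):
    print(case["Country"], case["Date_confirmation"])
```

The function builds new dicts instead of editing in place, so the original rows stay available for comparison before anything is written back.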