mlbrgl closed this 1 year ago
From triage: this is basically about the model used to fill missing values. Right now we fill using a closest strategy, but we could also fill forward or backward only, and sometimes that could make sense.
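To make the three strategies concrete, here is a minimal sketch, not the grapher's actual implementation. The function name `fill_value` and its parameters are hypothetical, and it assumes "forward" means carrying an earlier observation forward to the target time and "backward" means carrying a later observation back (the usual ffill/bfill convention).

```python
from typing import Dict, Optional


def fill_value(
    observations: Dict[int, float],
    target: int,
    tolerance: int,
    strategy: str = "closest",
) -> Optional[float]:
    """Hypothetical helper: pick a value for `target`, borrowing from a nearby time.

    `observations` maps a time (e.g. a year) to a value.
    """
    if strategy == "forward":
        # Assumption: carry an earlier observation forward, so only times <= target qualify.
        candidates = [t for t in observations if 0 <= target - t <= tolerance]
    elif strategy == "backward":
        # Assumption: carry a later observation backward, so only times >= target qualify.
        candidates = [t for t in observations if 0 <= t - target <= tolerance]
    elif strategy == "closest":
        # Look in both directions.
        candidates = [t for t in observations if abs(t - target) <= tolerance]
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    if not candidates:
        return None
    # Nearest time wins; ties go to the earlier time (an arbitrary choice for this sketch).
    best = min(candidates, key=lambda t: (abs(t - target), t))
    return observations[best]


observations = {2000: 1.0, 2005: 2.0}
closest = fill_value(observations, 2003, tolerance=5, strategy="closest")
forward = fill_value(observations, 2003, tolerance=5, strategy="forward")
backward = fill_value(observations, 2003, tolerance=5, strategy="backward")
```

With data for 2000 and 2005 and a target of 2003, "closest" and "backward" both borrow the 2005 value, while "forward" borrows the 2000 value.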
Add this to the overall tolerance project.
@HannahRitchie @maxroser Sophia and I were just chatting about this tolerance issue. Has there been another recent case where it was important for something you were trying to communicate?
@larsyencken Yes, Max, @edomt, and I were discussing a similar problem just a week or two ago.
Many countries are no longer vaccinating for COVID-19 (or very little, or not reporting updates). That means the most recent data point is a month or more in the past. Our COVID vaccination maps can look pretty empty with key countries missing.
The solution to this would be to have a large tolerance, so it picks up the latest data point even if it was in January. But setting this tolerance would also work the other way – countries would show vaccinations long before they started vaccinating.
So ideally we'd want zero tolerance for picking up data points that lie forwards in time, but a large tolerance for picking up data points that lie backwards in time.
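A rough sketch of what separate tolerances could look like, assuming times are plain day numbers. The names `value_within`, `back_tolerance` and `forward_tolerance` are made up for illustration and are not existing grapher settings.

```python
from typing import Dict, Optional


def value_within(
    observations: Dict[int, float],
    target: int,
    back_tolerance: int,
    forward_tolerance: int = 0,
) -> Optional[float]:
    """Hypothetical helper: nearest observation inside an asymmetric window.

    The window runs from `target - back_tolerance` to `target + forward_tolerance`.
    """
    candidates = [
        t
        for t in observations
        if target - back_tolerance <= t <= target + forward_tolerance
    ]
    if not candidates:
        return None
    # Nearest time wins; ties go to the earlier time (arbitrary for this sketch).
    best = min(candidates, key=lambda t: (abs(t - target), t))
    return observations[best]


# Made-up series: first report on day 100, last report on day 200.
doses = {100: 5.0, 200: 80.0}

# Viewing day 260: the stale day-200 value is still picked up.
late_view = value_within(doses, 260, back_tolerance=90, forward_tolerance=0)

# Viewing day 80, before the first report: nothing is shown.
early_view = value_within(doses, 80, back_tolerance=90, forward_tolerance=0)

# A symmetric window reproduces the misleading case: day 100 leaks back to day 80.
early_view_symmetric = value_within(doses, 80, back_tolerance=90, forward_tolerance=90)
```

Defaulting `forward_tolerance` to zero is a design choice for this sketch: a country then never shows a value before its first data point unless someone opts in.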
Author: @HannahRitchie
Author-rated priority: Medium
Completed?: No
Created on: December 8, 2021 8:49 PM
Description: You cannot specify a different tolerance going backwards vs. forwards in time, which can produce misleading results
Last modified: December 14, 2021 4:40 PM
tags: Author request
We can currently set 'tolerance' intervals on maps to control what data is shown. This is very useful, but it doesn't let us distinguish between an upper and a lower tolerance, i.e. whether to show data from forward in time or from back in time.
Often we want to show data for the last available date (i.e. dating backwards). But it's often inappropriate to also apply this tolerance at the earliest date (i.e. dating forwards).
An example of this is COVID vaccination policies. When we're looking at data for December 8th 2021, I want people to see the latest policy decision from, say, the last 40 days. If we set no tolerance then almost no countries show data.
https://ourworldindata.org/grapher/youngest-age-covid-vaccination?time=2021-12-08
But when I set a tolerance of 40 days, it also works at the other end. So on, say, December 17th 2020 it looks like countries had already started vaccinating when they hadn't (e.g. Brazil, which started on January 17th 2021, is shown). This is misleading. https://ourworldindata.org/grapher/youngest-age-covid-vaccination?time=2020-12-17
Ideally we would be able to set a different tolerance forwards and backwards in time.
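The Brazil case can be checked with simple date arithmetic. This is only an illustration using the dates quoted above. The helper `is_shown` and its parameter names are hypothetical.

```python
from datetime import date

viewed = date(2020, 12, 17)          # date selected on the map
brazil_first = date(2021, 1, 17)     # Brazil's first data point, as stated above

# Positive gap: the data point lies after the viewed date.
gap_days = (brazil_first - viewed).days


def is_shown(gap_days: int, back_tolerance: int, forward_tolerance: int) -> bool:
    """Hypothetical check: is a data point `gap_days` away inside the window?"""
    return -back_tolerance <= gap_days <= forward_tolerance


# Current behaviour with a 40-day tolerance in both directions.
shown_symmetric = is_shown(gap_days, back_tolerance=40, forward_tolerance=40)

# Proposed behaviour: 40 days backwards, zero days forwards.
shown_asymmetric = is_shown(gap_days, back_tolerance=40, forward_tolerance=0)
```

The gap is 31 days, so a symmetric 40-day tolerance shows Brazil on December 17th 2020, while a zero forward tolerance hides it.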