covid19-dash / covid-dashboard

Help welcomed if you have expertise in public health, web technology, data modeling and munging, or visualization.
https://covid19-dash.github.io/
BSD 3-Clause "New" or "Revised" License

Active cases vs new cases #91

Open nikohansen opened 4 years ago

nikohansen commented 4 years ago

I wonder what led to the choice to show active cases rather than new cases. I understand that active cases reflect the current problem at this point in time. New cases, however, have, AFAICS, two possibly more important features:

Roughly speaking, active cases are the less noisy but lagging indicator, while new cases are the noisier but leading indicator. I see more advantages in showing new rather than active cases, but why not show both?

I saw that

Unfortunately, they no longer report recovered, hence we cannot plot active, just confirmed

which would make it even more relevant to show new cases. AFAICS, cumulative confirmed cases do not reflect any relevant aspect of the problem if we do not know the ratio of recovered cases and if the number of cases is small compared to the overall population (that is, if there cannot be a relevant effect on population immunity). Am I missing something?
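For reference, new cases can still be derived when only cumulative confirmed counts are reported. The following is a minimal sketch, not the dashboard's code; the function name `daily_new_cases` and the numbers are made up for illustration.

```python
def daily_new_cases(cumulative):
    """Return day-over-day increases of a cumulative case series.

    Hypothetical helper: `cumulative` is a list of total confirmed
    counts, one entry per day, in chronological order.
    """
    # New cases on day i are the difference between consecutive totals.
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


# Made-up cumulative confirmed counts for five consecutive days.
confirmed = [100, 120, 150, 150, 210]
print(daily_new_cases(confirmed))  # [20, 30, 0, 60]
```

The result has one entry fewer than the input, since the first day has no previous total to compare against.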

GaelVaroquaux commented 4 years ago

I wonder what led to the choice to show active cases rather than new cases.

The goal was to show the "active" cases, because these are the ones spreading the pathogen. Anyhow, we can no longer show them, and we can only show the total confirmed.

New cases are very noisy. We would probably need to apply a temporal smoothing.

nikohansen commented 4 years ago

Maybe you overlooked it, but this is how new cases and smoothed new cases (with window size three) look:

http://www.cmap.polytechnique.fr/~nikolaus.hansen/covid19-figure2.html

The noise level certainly does not seem detrimental when considering the graph as a whole (as opposed to single data points, which should not be considered reliable on their own).
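A smoothing with window size three could look roughly like the sketch below. It assumes a centered moving average that shrinks the window at the edges; whether the linked figure uses a centered or trailing window is not stated here, and the name `moving_average` and the sample numbers are illustrative only.

```python
def moving_average(values, window=3):
    """Centered moving average with a shrinking window at the edges.

    Assumption: `window` is odd, so the window is symmetric around
    each data point.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        # Clip the window to the valid index range near the boundaries.
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up daily new case counts.
new_cases = [20, 30, 0, 60]
print(moving_average(new_cases, window=3))
```

A centered window avoids shifting the curve in time, at the cost of the last point being averaged over fewer days than the rest.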

The main problem as I see it isn't really noise, but that total cases carry less and less meaningful information to begin with (like when comparing, say, China total cases vs US total cases at this point in time).