covid19-dash / covid-dashboard

Help welcomed if you have expertise in public health web technology, data modeling and munging, or visualization.
https://covid19-dash.github.io/
BSD 3-Clause "New" or "Revised" License

Plot rates by default in addition to numbers? #8

Closed poldrack closed 4 years ago

poldrack commented 4 years ago

there are various ways to normalize the count data that could be relevant to understanding and prediction: for example, per capita, per number of hospital beds, per square mile, etc.
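A minimal sketch of these normalizations in plain Python. The helper name `normalize_count` and every figure below are hypothetical, chosen only for illustration; none of them come from the dashboard's code or from a real dataset.

```python
def normalize_count(count, denominator, scale=1):
    """Return count * scale / denominator, or None if the denominator is missing or zero."""
    if not denominator:
        return None
    return count * scale / denominator


# Hypothetical region: all numbers are made up for illustration.
region = {
    "cases": 500,
    "population": 2_000_000,
    "hospital_beds": 4_000,
    "area_sq_miles": 1_000,
}

cases_per_million = normalize_count(region["cases"], region["population"], scale=1_000_000)
cases_per_bed = normalize_count(region["cases"], region["hospital_beds"])
cases_per_sq_mile = normalize_count(region["cases"], region["area_sq_miles"])

print(cases_per_million, cases_per_bed, cases_per_sq_mile)
```

Returning `None` for a missing denominator is one possible design choice; it keeps regions without population or bed data from silently showing up as zero.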

GaelVaroquaux commented 4 years ago

There are two aspects:

GaelVaroquaux commented 4 years ago

@emmanuelle : you had an idea on where to get the data (population of each country) for this, I think.

GaelVaroquaux commented 4 years ago

I think that this will do the trick: https://download.geonames.org/export/dump/countryInfo.txt
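A rough sketch of how the population column could be read from a file in this layout. It assumes the file is tab-separated, that comment lines start with `#`, and that the header line starts with `#ISO` and includes `ISO3` and `Population` columns; that matches my understanding of the file but should be checked against the actual download. The function name `read_populations` is hypothetical, and the sample rows use placeholder numbers, not real figures.

```python
# Small inline excerpt in the assumed countryInfo.txt layout.
# Area and population values are placeholders, not real data.
SAMPLE = (
    "# free-form comment lines start with '#'\n"
    "#ISO\tISO3\tISO-Numeric\tfips\tCountry\tCapital\tArea(in sq km)\tPopulation\tContinent\n"
    "FR\tFRA\t250\tFR\tFrance\tParis\t100\t1000\tEU\n"
    "IT\tITA\t380\tIT\tItaly\tRome\t200\t2000\tEU\n"
)


def read_populations(text):
    """Map ISO3 country code -> population from countryInfo-style text."""
    header = None
    populations = {}
    for line in text.splitlines():
        if line.startswith("#ISO"):
            # The header row is itself commented out; drop the leading '#'.
            header = line[1:].split("\t")
            continue
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and ordinary comments
        if header is None:
            raise ValueError("no header line starting with '#ISO' before the data rows")
        record = dict(zip(header, line.split("\t")))
        populations[record["ISO3"]] = int(record["Population"])
    return populations


populations = read_populations(SAMPLE)
print(populations)
```

For the real file, the same function could be given the downloaded text in place of `SAMPLE`.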

emmanuelle commented 4 years ago

There is https://population.un.org/wpp/

emmanuelle commented 4 years ago

and https://www.gapminder.org/data/

GaelVaroquaux commented 4 years ago

I've merged #39 which normalizes the map by population.

I am wondering: should we also normalize the time series? Unlike with the map, we cannot keep all the information to have the un-normalized version in the hover (too much to store). @poldrack : any opinion?

Also, @poldrack : any advice on how to name the corresponding quantity in a way that is easy for non-specialists to understand? I wrote "Active cases per Million". I'm not sure that this is good English.

poldrack commented 4 years ago

I'm fine with leaving the timeseries un-normalized. could be useful to normalize it but not essential. I think your wording of "active cases per million" is fine, assuming that this is actually what the data refer to. is that what they are called in the JHU dataset?

GaelVaroquaux commented 4 years ago

is that what they are called in the JHU dataset?

The JHU dataset that I have doesn't give this. I computed them myself.

Thanks!

poldrack commented 4 years ago

can you tell me exactly how?

On Tue, Mar 17, 2020 at 11:58 AM Gael Varoquaux notifications@github.com wrote:

is that what they are called in the JHU dataset?

The JHU dataset that I have doesn't give this. I computed them myself.

Thanks!


-- Russell A. Poldrack Albert Ray Lang Professor of Psychology Professor (by courtesy) of Computer Science Bldg. 420, Jordan Hall Stanford University Stanford, CA 94305

poldrack@stanford.edu http://www.poldracklab.org/

GaelVaroquaux commented 4 years ago

The relevant lines are: https://github.com/covid19-dash/covid-dashboard/blob/master/make_figures.py#L29

poldrack commented 4 years ago

so you are not removing recovered cases and deaths, correct? in that case, the more appropriate term would be "cumulative cases"

On Tue, Mar 17, 2020 at 12:07 PM Gael Varoquaux notifications@github.com wrote:

The relevant lines are:

https://github.com/covid19-dash/covid-dashboard/blob/master/make_figures.py#L29


GaelVaroquaux commented 4 years ago

so you are not removing recovered cases and deaths, correct? in that case, the more appropriate term would be "cumulative cases"

Everything that we are plotting is "active": removing recovered and fatalities: https://github.com/covid19-dash/covid-dashboard/blob/master/data_input.py#L40
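To make the definition concrete, here is a small sketch of "active" as confirmed minus recovered minus fatalities, followed by the per-million normalization discussed earlier in the thread. It is not a copy of `data_input.py`; the function names `active_cases` and `per_million`, the daily counts, and the population are all hypothetical.

```python
def active_cases(confirmed, recovered, deaths):
    """Active cases per day: cumulative confirmed minus recovered minus deaths."""
    return [c - r - d for c, r, d in zip(confirmed, recovered, deaths)]


def per_million(series, population):
    """Scale each value in a series to a rate per million inhabitants."""
    return [value * 1_000_000 / population for value in series]


# Made-up cumulative counts for three consecutive days (illustration only).
confirmed = [100, 150, 220]
recovered = [10, 20, 40]
deaths = [1, 3, 5]

active = active_cases(confirmed, recovered, deaths)
active_per_million = per_million(active, population=5_000_000)

print(active)
print(active_per_million)
```

Without the subtraction, the series would be the cumulative count, which is the distinction raised in the question above.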

poldrack commented 4 years ago

ah, sorry I missed that - so yes, "active cases" is fine then

On Tue, Mar 17, 2020 at 12:19 PM Gael Varoquaux notifications@github.com wrote:

so you are not removing recovered cases and deaths, correct? in that case, the more appropriate term would be "cumulative cases"

Everything that we are plotting is "active": removing recovered and fatalities:

https://github.com/covid19-dash/covid-dashboard/blob/master/data_input.py#L40


GaelVaroquaux commented 4 years ago

I think that this is fixed. Closing.

GaelVaroquaux commented 4 years ago

Rethinking this, I think that one of the main ways people use this site is to compare countries. Hence, I am reopening, and I am in favor of plotting the normalized data in the time series plot.

We cannot display both, because I fear that it would make the website slower, and it is already quite slow: there is a long delay between clicking to select a country and the update. This delay makes the website awkward to use.

What do people think? @poldrack @emmanuelle

poldrack commented 4 years ago

yes, I think that cases per capita is much more interpretable and relevant to things like hospital capacity

On Wed, Mar 25, 2020 at 7:42 AM Gael Varoquaux notifications@github.com wrote:

Rethinking this, I think that one of the main ways people use this site is to compare countries. Hence, I am reopening, and I am in favor of plotting the normalized data in the time series plot.

We cannot display both, because I fear that it would make the website slower, and it is already quite slow: there is a long delay between clicking to select a country and the update. This delay makes the website awkward to use.

What do people think? @poldrack @emmanuelle


GaelVaroquaux commented 4 years ago

OK, I'm on it.