aatishb / covidtrends

Tracking the growth of COVID-19 Cases worldwide
https://aatishb.com/covidtrends/
MIT License
301 stars 107 forks source link

Per capita #1

Open ledahulevogyre opened 4 years ago

ledahulevogyre commented 4 years ago

Hi.

Why not use the number of cases per capita?

Great tool! Thanks.

a-lakhani commented 4 years ago

Per capita data would mainly change how far countries get from the origin. It wouldn't change the point of whether any particular country has successfully slowed the growth of COVID19 cases since both the x and y data for each country would be divided by the same denominator.
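(Editor's sketch of the point above, assuming the chart uses log-log axes with total cases on x and new cases on y. The numbers and names are made up for illustration. Dividing both coordinates by the same population shifts every point by the same amount along the diagonal, so the shape of the curve is unchanged.)

```javascript
// Hypothetical numbers, for illustration only.
const population = 5e6;
const totals = [100, 1000, 10000]; // x: total confirmed cases
const weekly = [50, 400, 2500];    // y: new cases in the past week

const perMillion = (v) => (v * 1e6) / population;

// Offset between the raw and the per capita curve, in log10 units.
const shiftX = totals.map((v) => Math.log10(perMillion(v)) - Math.log10(v));
const shiftY = weekly.map((v) => Math.log10(perMillion(v)) - Math.log10(v));

// Every entry equals log10(1e6 / population): a pure translation.
console.log(shiftX, shiftY);
```

A smaller population gives a larger shift to the upper right, which is why small countries become easier to see.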

ledahulevogyre commented 4 years ago

@a-lakhani of course. But it would ease the analysis for small countries.

a-lakhani commented 4 years ago

This Google sheet plots the per million numbers from 3/19/20.

The population calculations can be used to make a new version of @aatishb's visualization, but I don't know how to do that myself.

https://docs.google.com/spreadsheets/d/1EeGTgamjjIc9a2leKsbaCclEh4E4irxYxHJ0-yz_p9s/edit?usp=drivesdk

daald commented 4 years ago

I quickly hacked something together, using the population data from the linked document. The code is not nice, though: the population data is now embedded in the script instead of being loaded from an external source, and I didn't check the data quality. If you open the JavaScript console, you will also see a number of countries that didn't match for various reasons (different political views, different names, ...). I only adjusted a few of the most obvious mismatches (UK vs. United Kingdom etc.), so the matching is far from complete.
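(Editor's sketch of the kind of name matching described above. This is not the code from the fork: `populations`, `aliases`, and `lookupPopulation` are hypothetical names, and the figures are rough placeholders.)

```javascript
// Rough placeholder figures; a real table would cover every country.
const populations = {
  'United Kingdom': 67000000,
  'Switzerland': 8600000,
};

// Maps names used in the case data to names used in the population table.
const aliases = { 'UK': 'United Kingdom' };

function lookupPopulation(name) {
  const key = aliases[name] || name;
  if (Object.prototype.hasOwnProperty.call(populations, key)) {
    return populations[key];
  }
  // Unmatched names show up in the browser console for manual review.
  console.warn(`No population found for "${name}"`);
  return null;
}

console.log(lookupPopulation('UK'));
```

Returning `null` for unmatched names lets the caller skip the country rather than divide by an undefined value.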

You can find the code here: https://github.com/daald/covidtrends/tree/perCapita

And a demo here: http://htmlpreview.github.io/?https://raw.githubusercontent.com/daald/covidtrends/perCapita/index.html

I don't think I will continue this work, but maybe someone wants to improve it.

I will not create a merge request since this would demolish the author's original (brilliant!) work.

Btw: China is now on the far left because the infection happened only in a small area compared to the whole of China. This could also be interpreted to mean that the pandemic in the (bigger) rest of China hasn't even started yet...

a-lakhani commented 4 years ago

Thanks for hacking this together! I see you hard-coded population sizes in vue-definitions.js. Does that file contain the code which pulls the COVID-19 case data from an external source?

Is the site pulling data directly from the Hopkins CSSE repo in order to stay updated?

Btw: China is now on the far left because the infection happened only in a small area compared to the whole of China. This could also be interpreted to mean that the pandemic in the (bigger) rest of China hasn't even started yet...

Hopefully it means they were able to limit the spread with quick aggressive action, but we shall see.

daald commented 4 years ago

Thanks for hacking this together! I see you hard-coded population sizes in vue-definitions.js. Does that file contain the code which pulls the COVID-19 case data from an external source?

Is the site pulling data directly from the Hopkins CSSE repo in order to stay updated?

The rest of the code is untouched. The original source was https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series and is still used for the time series, just divided by population.
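(Editor's sketch of "divided by population", assuming each country's time series is an array of cumulative case counts. `toPerMillion` is a hypothetical helper, not a function from the fork.)

```javascript
// Scale cumulative case counts to cases per million inhabitants.
function toPerMillion(series, population) {
  if (!(population > 0)) {
    throw new RangeError('population must be a positive number');
  }
  // Multiply first so round inputs stay exact in floating point.
  return series.map((cases) => (cases * 1e6) / population);
}

// 50 cases among 2 million people is 25 cases per million.
console.log(toPerMillion([0, 50, 200], 2e6));
```

The results are floats rather than integers, so any code downstream that assumes whole case counts would need checking.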

[China]

Hopefully it means they were able to limit the spread with quick aggressive action, but we shall see.

At least they should be aware of the "problem". But it's like a big forest at the end of the summer: a little fire and they have to see it first and then react, while the rest of the world will probably already have some immunity..

stofee commented 4 years ago

daald, do you mind changing the labels on the axis to %? I think many people have a hard time understanding 10K/1M capita, but if they just see 0.1%, 1%, 10% etc., they know what the graph shows.
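(Editor's sketch of such a label, assuming the axis values are in cases per million. `percentLabel` is a hypothetical helper, and how it would hook into the chart's tick formatting is left open.)

```javascript
// Convert a per-million value to a percent-of-population label.
function percentLabel(perMillion) {
  const percent = perMillion / 1e4; // 1% of a population is 10,000 per million
  // toPrecision trims float noise; parseFloat drops trailing zeros.
  return `${parseFloat(percent.toPrecision(3))}%`;
}

console.log([100, 1000, 10000, 100000].map(percentLabel));
```

Only the label text changes here; the underlying data can stay in per-million units.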

stofee commented 4 years ago

Also, before you merge it back (I hope that is the end goal, with a switch between the original and the per capita version): I think the threshold for countries with at least 50 cases was accidentally changed to 50 per 1M capita. I think countries below this ratio that still have at least 50 cases do deserve a line, if the user wants them.
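(Editor's sketch of the behaviour asked for above: the threshold is applied to absolute counts, and a flag switches the displayed values. `MIN_CASES`, `visibleSeries`, and the data shape are assumptions for illustration, not the project's actual code.)

```javascript
const MIN_CASES = 50; // absolute threshold, independent of display mode

// countries: [{ name, population, cases: [cumulative counts] }]
function visibleSeries(countries, perCapita) {
  return countries
    // Filter on raw counts so large countries with low ratios keep their line.
    .filter((c) => Math.max(...c.cases) >= MIN_CASES)
    .map((c) => ({
      name: c.name,
      values: perCapita
        ? c.cases.map((n) => (n * 1e6) / c.population)
        : c.cases.slice(),
    }));
}

const sample = [
  { name: 'Large', population: 1e8, cases: [10, 60] }, // 0.6 per million
  { name: 'Small', population: 1e5, cases: [5, 20] },  // 200 per million
];
console.log(visibleSeries(sample, true).map((s) => s.name));
```

With these made-up numbers only "Large" is shown in either mode, because "Small" never reaches 50 absolute cases.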

daald commented 4 years ago

daald, do you mind changing the labels on the axis to %?

also before you merge it back (I hope that is the end-goal with a switch between the orig [...]

vs

I will not create a merge request since this would demolish the author's original (brilliant!) work.

I changed the minCasesInCountry, but I don't think that I will make any further changes. That's because I reached 90% of what I wanted to see, and the remaining 10% is hard work. For example, changing to % would mean changing the datatype to float, and this hasn't worked so far. And to make it really mergeable, the country sizes should be loaded from an external source, which has to be found and adjusted first, and as you said, there must be a switch. But you are free to fork it.

stofee commented 4 years ago

daald, do you mind changing the labels on the axis to %? also before you merge it back (I hope that is the end-goal with a switch between the orig [...]

vs

I will not create a merge request since this would demolish the author's original (brilliant!) work.

I did not notice this sentence, but I also disagree with it. Forking and having multiple sites published causes friction, and that can be more harmful in the long term. Also, afaik (I never did it on GitHub) there is a review process on the pull request, and the original author can deny it if they agree with you.

I changed the minCasesInCountry, but I don't think that I will make any further changes. That's because I reached 90% of what I wanted to see, and the remaining 10% is hard work. For example, changing to % would mean changing the datatype to float, and this hasn't worked so far. And to make it really mergeable, the country sizes should be loaded from an external source, which has to be found and adjusted first, and as you said, there must be a switch. But you are free to fork it.

I made a fork and made all your changes optional, so one can switch between the original version and the per capita version, the latter being the default. So it is at least possible to merge back, if you are ok with that.

daald commented 4 years ago

if you are ok with that.

Sure, go for it ;)

aatishb commented 4 years ago

See also #30

mmcguinn commented 4 years ago

@daald @stofee I started working on this independently as it had been itching at me, and ended up referencing both of your works partway through (after I found this issue) to get an idea of where some things were. I forked the current master, which includes the per-state/province feature added in the meantime, and I did my best to pull good numbers for them all. Everything is sitting on a wip branch in my fork for the moment:

example: http://htmlpreview.github.io/?https://raw.githubusercontent.com/mmcguinn/covidtrends/wip/index.html
branch: https://github.com/mmcguinn/covidtrends/tree/wip

I know overall there has been a lot of disagreement on if this should be added to the mainline, but figured I could at least toss an updated PoC with complete population numbers in here.

ClausAngermeier commented 4 years ago

Hi folks,

Any chance that there is also a version of the „per million“ for deaths? I only saw one for confirmed cases (which is already great), but the deaths are a little more eye-opening when it comes to comparing country strategies (e.g. Sweden versus Poland).

robertgalambos commented 4 years ago

See some plots about normalization via population size: https://github.com/aatishb/covidtrends/issues/29#issuecomment-633016196 I also think it would make sense to normalize, to be able to compare.

timmc commented 4 years ago

It wouldn't change the point of whether any particular country has successfully slowed the growth of COVID19 cases

@a-lakhani That's actually why it would be a great feature -- it wouldn't interfere with the existing graphs, but would additionally make different countries' responses more comparable.

For example, looking at this unnormalized data from the current site, you can see that the US is doing really poorly, and India and Brazil are as well:

[Image: absolute]

But those are all large countries! If we normalize by population (using @daald's fork) it's clear that India has actually done pretty well relative to population as compared to other countries; it's way back there near the origin. Spain, on the other hand, has now become very visible. It was hard to see in the other graph, but they had a lot of difficulty with containment.

[Image: per-capita]

(Note that these two screenshots do not include all of the same countries; they came with different preselected sets of countries, and I only made sure to include those four countries on both for illustration.)

rpkoller commented 4 years ago

@timmc "...India has actually done pretty well relative to population as compared to other countries"? if you take a look at the curve https://aatishb.com/covidtrends/?location=India india never left the exponential growth, and if you take into consideration that they had strict regulations against people leaving their homes, that curve looks even worse in my humble opinion. worrying, to be honest. and @aatishb wrote in https://github.com/aatishb/covidtrends/issues/30#issuecomment-607927252 about his thoughts on the normalization per capita. i can second his opinion. i am not really comfortable with the per capita values and the bias and narrative they might entail, as well as simply the purpose of comparing countries (which aren't comparable: different counting, different sampling effort even over time in the same country, and so forth). the only thing that counts imho is the containment in each country, no matter if you take the smallest country with a few thousand citizens or you look at china in contrast, and that is perfectly visualized, as well as the test numbers allow in the particular country, in the current iteration of covidtrends.

robertgalambos commented 4 years ago

@rpkoller Data can be misinterpreted in many forms. As with the current graph, in small countries it seems that "everything is fine" however it is not. The most important thing is to check the available data from as many angles as possible. One angle is the normalized value by population. Which by the way correlates with how likely it is to get infected in that country.