Closed by dmaymudes 4 years ago
Very cool! So the main issue is that we need to normalize location names between JHU CSSE and CDS data sources. This is something we should do for other reasons, too, because we can get much higher-quality data if we merge the historical CSSE and CDS time series data.
Captured the things that need to be done in issue #34.
I normalized the names in the population file so that, at least for countries, US states, and US counties, they match what the graph data uses. Setting norm=permillion appears to work for the cases I tested.
The timeseries-byLocation.json file now has more entries than it used to: 9187 entries with populations and 824 without.
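For anyone following along, here is a rough sketch of what a per-million normalization could look like. The function name `perMillion`, the series shape, and the population figure are all assumptions for illustration; this is not the project's actual apply_norm.

```javascript
// Hypothetical helper: scale a series of case counts to cases per million people.
// `populations` is an assumed lookup from location label to population.
function perMillion(series, populations) {
  const population = populations[series.label];
  if (!population) {
    // No population known for this label: signal that to the caller.
    return null;
  }
  return {
    label: series.label,
    // Multiply first, then divide, to keep the arithmetic simple.
    values: series.values.map((count) => (count * 1e6) / population),
  };
}

// Illustrative population figure, not real data.
const examplePopulations = { MA: 7000000 };
const normalizedSeries = perMillion(
  { label: "MA", values: [7, 70, 700] },
  examplePopulations
);
console.log(normalizedSeries.values);
```

Returning null for an unknown label makes a label mismatch visible instead of silently plotting unnormalized counts.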
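A count like this could be reproduced with something along these lines. The entry shape (an object keyed by location, each entry with an optional numeric `population` field) is an assumption; the real file may be laid out differently.

```javascript
// Count how many location entries carry a usable population value.
// The { population: number } entry shape is assumed for illustration.
function countPopulationCoverage(byLocation) {
  let withPopulation = 0;
  let withoutPopulation = 0;
  for (const entry of Object.values(byLocation)) {
    if (Number.isFinite(entry.population) && entry.population > 0) {
      withPopulation += 1;
    } else {
      withoutPopulation += 1;
    }
  }
  return { withPopulation, withoutPopulation };
}

// Tiny made-up sample; the figures are placeholders, not real data.
const sampleByLocation = {
  "Switzerland": { population: 8500000 },
  "Barnstable County, MA, US": { population: 210000 },
  "Unknown Region": {},
};
const coverage = countPopulationCoverage(sampleByLocation);
console.log(coverage);
```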
It would no longer be a terrible idea to accept this pull request, but let me know if you want further changes.
(Or accept it and then make them yourself, if you'd rather.)
I see that in #5 the referenced graph is in "percent" rather than "per million". Let me know if you think that's better.
Sorry for the slow reply. A few requests
Please don't accidentally accept this pull request. It doesn't actually work, because the region label in the series passed to apply_norm isn't in the same format as the population labels I have.
Let me know what you suggest; I could figure out how to change the population.js file to have labels like "Switzerland" and "MA" rather than "SWI" and "MA, US", or I could figure out a way to plumb the more canonical labels through.
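To make the first option concrete, here is a minimal sketch of re-keying the population table onto the chart's labels. The alias table, the function names, and the population figures are hypothetical; only the four example labels come from the comment above.

```javascript
// Hypothetical alias table from population-file labels to chart labels.
const LABEL_ALIASES = new Map([
  ["SWI", "Switzerland"],
  ["MA, US", "MA"],
]);

// Map a population-file label to the chart's label, leaving unknown labels alone.
function toChartLabel(populationLabel) {
  return LABEL_ALIASES.has(populationLabel)
    ? LABEL_ALIASES.get(populationLabel)
    : populationLabel;
}

// Build a new population table keyed by chart labels.
function rekeyPopulations(populations) {
  const rekeyed = {};
  for (const [label, value] of Object.entries(populations)) {
    rekeyed[toChartLabel(label)] = value;
  }
  return rekeyed;
}

// Placeholder figures, not real data.
const rekeyed = rekeyPopulations({ "SWI": 8500000, "MA, US": 7000000, "France": 67000000 });
console.log(Object.keys(rekeyed));
```

A hand-maintained alias table is easy to review but has to be kept in sync with both data sources; plumbing canonical labels through would avoid that at the cost of touching more code.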
I also see that timeseries-byLocation.json has "Barnstable County, MA, US" where whatever data source the current version of the chart uses has "Barnstable, MA", so further cleanup may be required.
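One possible shape for that cleanup, as a sketch only: strip the trailing ", US" and the " County" suffix from three-part labels. The function name is made up, and this deliberately ignores other suffixes (parishes, boroughs, and so on), which would need their own handling.

```javascript
// Hypothetical cleanup: "Barnstable County, MA, US" -> "Barnstable, MA".
// Labels that are not of the form "<name>, <state>, US" are returned unchanged.
function normalizeUsCountyLabel(label) {
  const parts = label.split(",").map((part) => part.trim());
  if (parts.length === 3 && parts[2] === "US") {
    const county = parts[0].replace(/ County$/, "");
    return `${county}, ${parts[1]}`;
  }
  return label;
}

console.log(normalizeUsCountyLabel("Barnstable County, MA, US"));
```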
I guess nothing so terrible would happen if you did accept it, because it doesn't do anything yet without norm=permillion.