covid19-dash / covid-dashboard

Help welcomed if you have expertise in public health web technology, data modeling and munging, or visualization.
https://covid19-dash.github.io/
BSD 3-Clause "New" or "Revised" License

County vs. state information for USA: total or remainder? #79

Open emmanuelle opened 4 years ago

emmanuelle commented 4 years ago

The Johns Hopkins dataset mixes state-level and county-level rows for the USA in the Province/State column. For some states (e.g. California) there is a row for the state itself as well as rows for some of its counties (although most of the time the county-level numbers are 0). It is not clear how the group-by should be performed in this case: is the state row the total, or the remainder once the counties have been counted?

emmanuelle commented 4 years ago

For example, for New York:

In [50]: df[df['Province/State'].str[-2:] == 'NY']                                                                                  
Out[50]: 
             Province/State Country/Region      Lat     Long  1/22/20  ...  3/18/20  3/19/20  3/20/20  3/21/20  3/22/20
274      Suffolk County, NY             US  40.9849 -72.6151        0  ...        0        0        0        0        0
275       Ulster County, NY             US  41.8586 -74.3118        0  ...        0        0        0        0        0
285     Rockland County, NY             US  41.1489 -73.9830        0  ...        0        0        0        0        0
286     Saratoga County, NY             US  43.0324 -73.9360        0  ...        0        0        0        0        0
308       Nassau County, NY             US  40.6546 -73.5594        0  ...        0        0        0        0        0
319     New York County, NY             US  40.7128 -74.0060        0  ...        0        0        0        0        0
332  Westchester County, NY             US  41.1220 -73.7949        0  ...        0        0        0        0        0

[7 rows x 65 columns]

In [51]: df[df['Province/State'] == 'New York']                                                                                     
Out[51]: 
   Province/State Country/Region      Lat     Long  1/22/20  1/23/20  ...  3/17/20  3/18/20  3/19/20  3/20/20  3/21/20  3/22/20
99       New York             US  42.1657 -74.9481        0        0  ...     1706     2495     5365     8310    11710    15793

[1 rows x 65 columns]
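
To make the two possible readings concrete, here is a minimal sketch on a toy frame built from a few of the rows above (last three date columns only). It assumes county rows can be recognised by a trailing ", XX" two-letter code, and `ABBREV_TO_STATE` is a hypothetical, partial lookup that is not part of the dataset. It is not a proposal for the final group-by, only an illustration of the two options.

```python
import pandas as pd

# Toy frame copied from the rows shown above (subset of rows and date columns).
df = pd.DataFrame(
    {
        "Province/State": ["Suffolk County, NY", "New York County, NY", "New York"],
        "Country/Region": ["US", "US", "US"],
        "3/20/20": [0, 0, 8310],
        "3/21/20": [0, 0, 11710],
        "3/22/20": [0, 0, 15793],
    }
)
date_cols = ["3/20/20", "3/21/20", "3/22/20"]

# Assumption: a county row ends with ", " plus a two-letter state code.
# na=False treats missing Province/State values as "not a county".
is_county = df["Province/State"].str.contains(r",\s*[A-Z]{2}$", regex=True, na=False)

# Reading A: the state row is already the total, so county rows are dropped.
state_as_total = df[~is_county].groupby("Province/State")[date_cols].sum()

# Reading B: the state row is a remainder, so county rows are added to it.
# Hypothetical, partial mapping from state code to the state-level label.
ABBREV_TO_STATE = {"NY": "New York"}
state_key = df["Province/State"].where(
    ~is_county, df["Province/State"].str[-2:].map(ABBREV_TO_STATE)
)
state_plus_counties = df.groupby(state_key)[date_cols].sum()

print(state_as_total)
print(state_plus_counties)
```

On these rows both readings give the same numbers because the county counts are all 0; they would diverge as soon as a county row carried non-zero values, which is why the intended semantics matter.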