Open emmanuelle opened 4 years ago
For example:
In [50]: df[df['Province/State'].str[-2:] == 'NY']
Out[50]:
Province/State Country/Region Lat Long 1/22/20 ... 3/18/20 3/19/20 3/20/20 3/21/20 3/22/20
274 Suffolk County, NY US 40.9849 -72.6151 0 ... 0 0 0 0 0
275 Ulster County, NY US 41.8586 -74.3118 0 ... 0 0 0 0 0
285 Rockland County, NY US 41.1489 -73.9830 0 ... 0 0 0 0 0
286 Saratoga County, NY US 43.0324 -73.9360 0 ... 0 0 0 0 0
308 Nassau County, NY US 40.6546 -73.5594 0 ... 0 0 0 0 0
319 New York County, NY US 40.7128 -74.0060 0 ... 0 0 0 0 0
332 Westchester County, NY US 41.1220 -73.7949 0 ... 0 0 0 0 0
[7 rows x 65 columns]
In [51]: df[df['Province/State'] == 'New York']
Out[51]:
Province/State Country/Region Lat Long 1/22/20 1/23/20 ... 3/17/20 3/18/20 3/19/20 3/20/20 3/21/20 3/22/20
99 New York US 42.1657 -74.9481 0 0 ... 1706 2495 5365 8310 11710 15793
[1 rows x 65 columns]
The Johns Hopkins dataset has information at state or county level in the Province/State column. For some states (e.g. California) there is information both for the state and for some of its counties (although most of the time the numbers are 0 at county level). It is not clear how the group-by should be performed in this case.
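One possible way to handle this, as a rough sketch only: flag the county-level rows and leave them out before grouping, so that a state is not counted twice. This assumes that county rows always look like "<name>, <two-letter state code>" and that the state-level row already covers its counties; neither assumption is confirmed by the dataset documentation. The toy frame below reuses three of the New York rows shown above, and the names `is_county`, `state_level` and `by_country` are just illustrative.

```python
import pandas as pd

# Small frame mimicking the rows shown above (only two date columns kept).
df = pd.DataFrame({
    "Province/State": ["New York", "Suffolk County, NY", "Nassau County, NY"],
    "Country/Region": ["US", "US", "US"],
    "3/21/20": [11710, 0, 0],
    "3/22/20": [15793, 0, 0],
})
date_cols = ["3/21/20", "3/22/20"]

# Assumption: county-level entries end with ", XX" (two capital letters).
is_county = df["Province/State"].str.contains(r", [A-Z]{2}$", na=False)

# Keep only state-level rows, then aggregate per country.
state_level = df[~is_county]
by_country = state_level.groupby("Country/Region")[date_cols].sum()
print(by_country)
```

If it turns out that county rows carry counts that are missing from the state row, the opposite choice (mapping each county to its state and summing) might be needed instead, so this should be checked against the data before relying on it.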