JeroenKools / covid19

Data visualizations of the spread of the 2019 novel Coronavirus (COVID-19), based on data from Johns Hopkins University.
MIT License

KeyError: 'the label [Mainland China] is not in the [index]' #1

Closed setaur closed 4 years ago

setaur commented 4 years ago

I'm trying to run your code so that I can later modify the list of countries and plot more of these great charts. I only know the basics of Python. I converted the .ipynb file to a .py script:

jupyter-nbconvert --to python COVID-19.ipynb

This produced COVID-19.py, but running it failed with:

File "./COVID-19.py", line 48
plt.gca().get_yaxis().set_major_formatter(matplotlib.ticker.FuncFormatter(lambda x, p: f"{x:,.0f}"))
^
SyntaxError: invalid syntax

I then tried changing the shebang from #!/usr/bin/env python to #!/usr/bin/env python3, and got:

Province/State        Country/Region      Lat   ...     3/9/20  3/10/20  3/11/20
0                NaN              Thailand  15.0000   ...         50       53       59
1                NaN                 Japan  36.0000   ...        511      581      639
2                NaN             Singapore   1.2833   ...        150      160      178
3                NaN                 Nepal  28.1667   ...          1        1        1
4                NaN              Malaysia   2.5000   ...        117      129      149
5   British Columbia                Canada  49.2827   ...         32       32       39
6    New South Wales             Australia -33.8688   ...         48       55       65
7           Victoria             Australia -37.8136   ...         15       18       21
8         Queensland             Australia -28.0167   ...         15       18       20
9                NaN              Cambodia  11.5500   ...          2        2        3
10               NaN             Sri Lanka   7.0000   ...          1        1        2
11               NaN               Germany  51.0000   ...       1176     1457     1908
12               NaN               Finland  64.0000   ...         30       40       59
13               NaN  United Arab Emirates  24.0000   ...         45       74       74
14               NaN           Philippines  13.0000   ...         20       33       49

[15 rows x 54 columns]
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/pandas/core/indexing.py", line 1790, in _validate_key
error()
File "/usr/lib/python3/dist-packages/pandas/core/indexing.py", line 1785, in error
axis=self.obj._get_axis_name(axis)))
KeyError: 'the label [Mainland China] is not in the [index]'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "./COVID-19.bak.py", line 77, in <module>
by_country.loc["All except China", dates] =        by_country.sum().loc[dates]-by_country.loc["Mainland China", dates]   # Add "Outside China" row
File "/usr/lib/python3/dist-packages/pandas/core/indexing.py", line 1472, in __getitem__
return self._getitem_tuple(key)
File "/usr/lib/python3/dist-packages/pandas/core/indexing.py", line 870, in _getitem_tuple
return self._getitem_lowerdim(tup)
File "/usr/lib/python3/dist-packages/pandas/core/indexing.py", line 998, in _getitem_lowerdim
section = self._getitem_axis(key, axis=i)
File "/usr/lib/python3/dist-packages/pandas/core/indexing.py", line 1911, in _getitem_axis
self._validate_key(key, axis)
File "/usr/lib/python3/dist-packages/pandas/core/indexing.py", line 1798, in _validate_key
error()
File "/usr/lib/python3/dist-packages/pandas/core/indexing.py", line 1785, in error
axis=self.obj._get_axis_name(axis)))
KeyError: 'the label [Mainland China] is not in the [index]'

I'm out of ideas. Here is the full converted script: https://pastebin.com/raw/HrcgUtqZ

JeroenKools commented 4 years ago

@setaur: Thanks for reporting; I'm aware of the issue and have a fix. The Mainland China KeyError is due to a change in the format of the raw data reported by Johns Hopkins. A fix will be pushed shortly :)
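
For anyone hitting the same KeyError before pulling the fix, here is a minimal sketch of a more defensive lookup. It is not the change that was committed to the repo. It assumes the row was renamed to "China" (that label is a guess, as is the helper name first_present_label), and it uses a tiny stand-in table rather than the real Johns Hopkins file. The Japan and Germany counts come from the printout above; the China counts are placeholders.

```python
import pandas as pd


def first_present_label(index, candidates):
    """Return the first candidate label that exists in the index (hypothetical helper)."""
    for label in candidates:
        if label in index:
            return label
    raise KeyError(f"none of {candidates} found in the index")


# Toy stand-in for the per-country table; the China values are placeholders, not real data.
by_country = pd.DataFrame(
    {"3/10/20": [100, 581, 1457], "3/11/20": [110, 639, 1908]},
    index=["China", "Japan", "Germany"],
)
dates = list(by_country.columns)

# Try the old label first, then the assumed new one.
china_label = first_present_label(by_country.index, ["Mainland China", "China"])

# Total over all countries minus the China row, computed before the new row is added.
outside_china = by_country[dates].sum() - by_country.loc[china_label, dates]
by_country.loc["All except China"] = outside_china

print(by_country)
```

If neither label is present, the helper raises a KeyError that lists the candidates it tried, which is easier to act on than the error from deep inside pandas indexing.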

As for the other issue you were having, you're correct that it's a Python version issue: the script uses f-strings, which require Python 3.6 or later.
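
To illustrate the version point, the sketch below shows the same thousands-separator formatting written with str.format instead of an f-string, so that older interpreters do not reject it as a syntax error. The function name thousands is only illustrative and is not part of the notebook. Running the script with python3, as you did, remains the simpler fix.

```python
import sys


def thousands(x, pos=None):
    """Format a tick value with thousands separators, e.g. 1234567 -> '1,234,567'.

    Has the (value, position) signature that matplotlib.ticker.FuncFormatter
    expects, so it could be passed there in place of the f-string lambda.
    """
    return "{:,.0f}".format(x)


# f-strings need Python 3.6+; this check reports whether the running interpreter supports them.
supports_fstrings = sys.version_info >= (3, 6)
print("f-strings supported:", supports_fstrings)
print(thousands(1234567))
```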

JeroenKools commented 4 years ago

Fixed in the latest commit, although there are currently some data quality issues in the upstream data repo.