pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

df.plot() could cause "ValueError: year XXXXX is out of range" if matplotlib.dates.AutoDateLocator is used #18344

Open sakurai-youhei opened 6 years ago

sakurai-youhei commented 6 years ago

I encountered a really weird problem with the combination of df.plot(), matplotlib.dates.AutoDateLocator, and related pieces. For details, please see the code below, which reproduced the error in every environment I checked.

Code to reproduce the error

from datetime import datetime
from matplotlib import pyplot as plt
from matplotlib.dates import AutoDateLocator
import pandas as pd

N = 2
index = pd.date_range(
    datetime.now().replace(second=0, microsecond=0),  # If second!=0, then the error disappears.
    periods=N,
    freq='60Min',  # If freq='H', then the error disappears.
)

df = pd.DataFrame({"A": range(N)}, index)
ax = df.plot()

# The error occurs only when AutoDateLocator is used.
#   i.e.
#   - AutoDateLocator -> error
#   - MicrosecondLocator -> no error
#   - YearLocator -> no error
# It doesn't matter whether the locator is set as major or minor.
ax.xaxis.set_major_locator(AutoDateLocator())

print(df)
plt.show()

Stacktrace

Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Python36\lib\tkinter\__init__.py", line 1699, in __call__
    return self.func(*args)
  File "C:\Python36\lib\tkinter\__init__.py", line 745, in callit
    func(*args)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\backends\backend_tkagg.py", line 323, in idle_draw
    self.draw()
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\backends\backend_tkagg.py", line 304, in draw
    FigureCanvasAgg.draw(self)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\backends\backend_agg.py", line 430, in draw
    self.figure.draw(self.renderer)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\figure.py", line 1295, in draw
    renderer, self, artists, self.suppressComposite)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\image.py", line 138, in _draw_list_compositing_images
    a.draw(renderer)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\axes\_base.py", line 2399, in draw
    mimage._draw_list_compositing_images(renderer, self, artists)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\image.py", line 138, in _draw_list_compositing_images
    a.draw(renderer)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\axis.py", line 1133, in draw
    ticks_to_draw = self._update_ticks(renderer)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\axis.py", line 974, in _update_ticks
    tick_tups = list(self.iter_ticks())
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\axis.py", line 917, in iter_ticks
    majorLocs = self.major.locator()
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\dates.py", line 1054, in __call__
    self.refresh()
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\dates.py", line 1074, in refresh
    dmin, dmax = self.viewlim_to_dt()
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\dates.py", line 832, in viewlim_to_dt
    return num2date(vmin, self.tz), num2date(vmax, self.tz)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\dates.py", line 441, in num2date
    return _from_ordinalf(x, tz)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\dates.py", line 256, in _from_ordinalf
    dt = datetime.datetime.fromordinal(ix).replace(tzinfo=UTC)
ValueError: year 68948 is out of range
Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Python36\lib\tkinter\__init__.py", line 1699, in __call__
    return self.func(*args)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\backends\backend_tkagg.py", line 233, in resize
    self.show()
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\backends\backend_tkagg.py", line 304, in draw
    FigureCanvasAgg.draw(self)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\backends\backend_agg.py", line 430, in draw
    self.figure.draw(self.renderer)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\figure.py", line 1295, in draw
    renderer, self, artists, self.suppressComposite)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\image.py", line 138, in _draw_list_compositing_images
    a.draw(renderer)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\axes\_base.py", line 2399, in draw
    mimage._draw_list_compositing_images(renderer, self, artists)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\image.py", line 138, in _draw_list_compositing_images
    a.draw(renderer)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\axis.py", line 1133, in draw
    ticks_to_draw = self._update_ticks(renderer)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\axis.py", line 974, in _update_ticks
    tick_tups = list(self.iter_ticks())
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\axis.py", line 917, in iter_ticks
    majorLocs = self.major.locator()
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\dates.py", line 1054, in __call__
    self.refresh()
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\dates.py", line 1074, in refresh
    dmin, dmax = self.viewlim_to_dt()
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\dates.py", line 832, in viewlim_to_dt
    return num2date(vmin, self.tz), num2date(vmax, self.tz)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\dates.py", line 441, in num2date
    return _from_ordinalf(x, tz)
  File "C:\Users\sakurai\Desktop\ENV\lib\site-packages\matplotlib\dates.py", line 256, in _from_ordinalf
    dt = datetime.datetime.fromordinal(ix).replace(tzinfo=UTC)
ValueError: year 68948 is out of range

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 42 Stepping 7, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.21.0
pytest: None
pip: 9.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.13.3
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Windows + Python 3.6 + the newest libraries

(ENV) C:\Users\sakurai\Desktop>python -VV
Python 3.6.1 (v3.6.1:69c0db5, Mar 21 2017, 18:41:36) [MSC v.1900 64 bit (AMD64)]
(ENV) C:\Users\sakurai\Desktop>python -m pip freeze
cycler==0.10.0
matplotlib==2.1.0
numpy==1.13.3
pandas==0.21.0
pyparsing==2.2.0
python-dateutil==2.6.1
pytz==2017.3
six==1.11.0

Windows + Python 3.5 + old libraries

C:\Users\sakurai\Desktop>C:\Python35\python.exe -c "import sys; print(sys.version)"
3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:01:18) [MSC v.1900 32 bit (Intel)]
C:\Users\sakurai\Desktop>C:\Python35\python.exe -m pip freeze | findstr "pandas matplotlib"
matplotlib==1.5.3
pandas==0.19.1

Debian + Python 3.6 + the newest libraries

root@0bf58ce62933:/# python -VV
Python 3.6.3 (default, Nov 4 2017, 22:17:09)
[GCC 4.9.2]
root@0bf58ce62933:/# python -m pip freeze
cycler==0.10.0
matplotlib==2.1.0
numpy==1.13.3
pandas==0.21.0
pyparsing==2.2.0
python-dateutil==2.6.1
pytz==2017.3
six==1.11.0

Problem description

When the error occurs, an empty window pops up.

Expected result

The graph should at least be drawn there.

TomAugspurger commented 6 years ago

So, IIUC the issue is with matplotlib's AutoDateLocator not understanding DatetimeIndexes, correct?

That's not too surprising, since they don't depend on pandas. We provide our own locators / formatters for our dtypes, which is why the default .plot should work.

sakurai-youhei commented 6 years ago

Yes, probably. But I don’t have a clear idea of what the root cause is; moreover, I’m not familiar with either Pandas or matplotlib.

Do you mean it’s not an issue in Pandas but in my use of a non-Pandas locator with Pandas? Or is it rather a problem in matplotlib?

Thanks.

sakurai-youhei commented 6 years ago

I found https://github.com/pandas-dev/pandas/blob/c3cfe90d8b7be9435c59f279fc933c7931f0e215/pandas/plotting/_converter.py, so this is probably my own mistake. I’ll close this issue after trying the locators defined there. Thanks.

TomAugspurger commented 6 years ago

> Do you mean it’s not an issue in Pandas but in my use of a non-Pandas locator with Pandas? Or is it rather a problem in matplotlib?

Both :) I suspect matplotlib's AutoDateLocator doesn't know how to deal with pandas' datetimes. Though there has been some work there recently, so maybe try with matplotlib master: https://github.com/matplotlib/matplotlib/pull/9794/files
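One way to see how a mismatch in axis units could yield a year like 68948 is a rough arithmetic check. The sketch below uses only the standard library and rests on an assumption this thread does not confirm: that pandas puts period ordinals counted from 1970-01-01 on the x axis (minutes for freq='60Min'), while the traceback's `datetime.fromordinal` call reads the same number as days since 0001-01-01. The timestamp is hypothetical, chosen near the date of the report.

```python
from datetime import datetime

MAX_ORDINAL = datetime.max.toordinal()  # largest day number datetime accepts (9999-12-31)

# Hypothetical timestamp near the date of the report (not taken from the logs).
stamp = datetime(2017, 11, 17, 12, 0)
elapsed = stamp - datetime(1970, 1, 1)

# Assumption: the axis holds period ordinals counted from 1970-01-01.
minutes_since_epoch = int(elapsed.total_seconds() // 60)  # as for freq='60Min'
hours_since_epoch = int(elapsed.total_seconds() // 3600)  # as for freq='H'


def as_ordinal_day(value):
    """Read value as a day number, as the fromordinal call in the traceback does."""
    try:
        return datetime.fromordinal(value)
    except ValueError as exc:
        return exc


minute_result = as_ordinal_day(minutes_since_epoch)
hour_result = as_ordinal_day(hours_since_epoch)

# A minute count is far beyond the largest valid day number; an hour count is not.
print(minutes_since_epoch > MAX_ORDINAL, type(minute_result).__name__)
print(hours_since_epoch > MAX_ORDINAL, type(hour_result).__name__)
```

Under that assumption, a minute count divided by about 365 days per year lands in the 68,900s, the same range as the reported year. An hour count stays within the valid range, which would be consistent with the note in the report that freq='H' avoids the error, although nothing in this thread confirms that mechanism.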

sakurai-youhei commented 6 years ago

Neither PandasAutoDateLocator nor matplotlib from the master branch helps...

from datetime import datetime
from matplotlib import pyplot as plt
import pandas as pd
from pandas.tseries.converter import PandasAutoDateLocator

N = 2
index = pd.date_range(
    datetime.now().replace(second=0, microsecond=0),
    periods=N,
    freq='60Min',
)

df = pd.DataFrame({"A": range(N)}, index)
ax = df.plot()
ax.xaxis.set_major_locator(PandasAutoDateLocator())

print(df)
plt.show()

Stacktrace

Exception in Tkinter callback
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/tkinter/__init__.py", line 1699, in __call__
    return self.func(*args)
  File "/usr/local/lib/python3.6/tkinter/__init__.py", line 745, in callit
    func(*args)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/backends/backend_tkagg.py", line 324, in idle_draw
    self.draw()
  File "/usr/local/lib/python3.6/site-packages/matplotlib/backends/backend_tkagg.py", line 305, in draw
    FigureCanvasAgg.draw(self)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/backends/backend_agg.py", line 427, in draw
    self.figure.draw(self.renderer)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/figure.py", line 1327, in draw
    renderer, self, artists, self.suppressComposite)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/image.py", line 138, in _draw_list_compositing_images
    a.draw(renderer)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/axes/_base.py", line 2448, in draw
    mimage._draw_list_compositing_images(renderer, self, artists)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/image.py", line 138, in _draw_list_compositing_images
    a.draw(renderer)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/axis.py", line 1134, in draw
    ticks_to_draw = self._update_ticks(renderer)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/axis.py", line 974, in _update_ticks
    tick_tups = list(self.iter_ticks())
  File "/usr/local/lib/python3.6/site-packages/matplotlib/axis.py", line 917, in iter_ticks
    majorLocs = self.major.locator()
  File "/usr/local/lib/python3.6/site-packages/matplotlib/dates.py", line 1088, in __call__
    self.refresh()
  File "/usr/local/lib/python3.6/site-packages/matplotlib/dates.py", line 1108, in refresh
    dmin, dmax = self.viewlim_to_dt()
  File "/usr/local/lib/python3.6/site-packages/matplotlib/dates.py", line 866, in viewlim_to_dt
    return num2date(vmin, self.tz), num2date(vmax, self.tz)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/dates.py", line 471, in num2date
    return _from_ordinalf(x, tz)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/dates.py", line 281, in _from_ordinalf
    dt = datetime.datetime.fromordinal(ix).replace(tzinfo=UTC)
ValueError: year 68953 is out of range
Exception in Tkinter callback
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/tkinter/__init__.py", line 1699, in __call__
    return self.func(*args)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/backends/backend_tkagg.py", line 234, in resize
    self.show()
  File "/usr/local/lib/python3.6/site-packages/matplotlib/backends/backend_tkagg.py", line 305, in draw
    FigureCanvasAgg.draw(self)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/backends/backend_agg.py", line 427, in draw
    self.figure.draw(self.renderer)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/figure.py", line 1327, in draw
    renderer, self, artists, self.suppressComposite)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/image.py", line 138, in _draw_list_compositing_images
    a.draw(renderer)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/axes/_base.py", line 2448, in draw
    mimage._draw_list_compositing_images(renderer, self, artists)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/image.py", line 138, in _draw_list_compositing_images
    a.draw(renderer)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/axis.py", line 1134, in draw
    ticks_to_draw = self._update_ticks(renderer)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/axis.py", line 974, in _update_ticks
    tick_tups = list(self.iter_ticks())
  File "/usr/local/lib/python3.6/site-packages/matplotlib/axis.py", line 917, in iter_ticks
    majorLocs = self.major.locator()
  File "/usr/local/lib/python3.6/site-packages/matplotlib/dates.py", line 1088, in __call__
    self.refresh()
  File "/usr/local/lib/python3.6/site-packages/matplotlib/dates.py", line 1108, in refresh
    dmin, dmax = self.viewlim_to_dt()
  File "/usr/local/lib/python3.6/site-packages/matplotlib/dates.py", line 866, in viewlim_to_dt
    return num2date(vmin, self.tz), num2date(vmax, self.tz)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/dates.py", line 471, in num2date
    return _from_ordinalf(x, tz)
  File "/usr/local/lib/python3.6/site-packages/matplotlib/dates.py", line 281, in _from_ordinalf
    dt = datetime.datetime.fromordinal(ix).replace(tzinfo=UTC)
ValueError: year 68953 is out of range

root@6d08a38f34e5:/# python -VV
Python 3.6.3 (default, Nov  4 2017, 22:17:09)
[GCC 4.9.2]
root@6d08a38f34e5:/# python -m pip freeze
cycler==0.10.0
matplotlib==0+unknown
numpy==1.13.3
pandas==0.21.0
pyparsing==2.2.0
python-dateutil==2.6.1
pytz==2017.3
six==1.11.0
root@6d08a38f34e5:/# python -c "import matplotlib as mpl; print(mpl.__version__)"
0+unknown
root@6d08a38f34e5:/# python -c "import pandas as pd; pd.show_versions()"

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.49-moby
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.0
pytest: None
pip: 9.0.1
setuptools: 36.6.0
Cython: None
numpy: 1.13.3
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 0+unknown
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

TomAugspurger commented 6 years ago

Is there a problem with the default formatter picked by pandas?

You're welcome to dig into pandas/plotting/_converter.py to see what's going on.

sakurai-youhei commented 6 years ago

@TomAugspurger Thanks for your guidance. Actually, the problem was that there is no relevant information on the Internet, and my problem right now is that the default locators look slightly different from AutoXxxLocator. :) If I find the root cause, I will suggest something here or through a pull request. Thanks.

joshmalina commented 5 years ago

So this is not a fix, but if you call .to_timestamp() on the dataframe before you call plot, it converts it from a PeriodIndex to a DatetimeIndex, and then the issue is no longer there. It would be nice if this was fixed for PeriodIndex.
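A minimal sketch of the conversion described in the comment above, assuming a frame that already has a PeriodIndex (the frame, its frequency, and its dates are hypothetical). It shows only the index conversion done by `DataFrame.to_timestamp()`; whether plotting the converted frame avoids the error is the commenter's observation and is not verified here.

```python
import pandas as pd

# Hypothetical frame with a minute-frequency PeriodIndex.
periods = pd.period_range("2017-11-17 12:00", periods=2, freq="min")
df = pd.DataFrame({"A": range(2)}, index=periods)

# to_timestamp() returns a new frame whose index is a DatetimeIndex
# holding the start time of each period.
converted = df.to_timestamp()

print(type(df.index).__name__)
print(type(converted.index).__name__)
```

The converted frame can then be passed to `.plot()` as in the comment.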