quantopian / zipline

Zipline, a Pythonic Algorithmic Trading Library
https://www.zipline.io
Apache License 2.0

zipline fails to load price data for 1962 year #1540

Open bartosh opened 7 years ago

bartosh commented 7 years ago

Dear Zipline Maintainers,

Before I tell you about my issue, let me describe my environment:

Environment

Now that you know a little about me, let me tell you about the issue I am having:

Description of Issue

Zipline crashed when I tried to ingest this price data:

date,open,high,low,close,volume
1962-01-02,65.56,65.75,65.38,65.38,5600.0
1962-01-03,65.38,66.37,65.25,66.37,7467.0
1962-01-04,66.37,66.88,66.37,66.37,8067.0
1962-01-05,66.37,66.75,66.13,66.25,7067.0
1962-01-08,66.0,66.0,63.5,64.0,9400.0
1962-01-09,64.0,65.0,63.75,63.75,6467.0
1962-01-10,63.75,64.0,63.5,63.88,3467.0
1962-01-11,63.62,63.62,63.25,63.5,2800.0
1962-01-12,63.5,63.62,62.0,62.0,2667.0
1962-01-15,61.5,61.5,60.75,60.75,6467.0

I used my custom bundle loader custom-csvdir. You can see its code here: https://github.com/bartosh/zipline/blob/master/zipline/data/bundles/csvdir.py

Here is how I used it:

Here are some debug output and a traceback:

table: 0 <class 'bcolz.ctable.ctable'> [4042592896 4042679296 4042765696 4042852096 4043111296 4043197696 4043284096 4043370496 4043456896 4043716096]
asset_first_day= 4042592896
idx= 7003
market_closes_nanos= [ 631314000000000000 631400400000000000 631486800000000000 ..., 1507752000000000000 1507838400000000000 1507924800000000000]
dt= 4042592896000000000 False
schedule.index= DatetimeIndex(['1990-01-02', '1990-01-03', '1990-01-04', '1990-01-05',
               '1990-01-08', '1990-01-09', '1990-01-10', '1990-01-11',
               '1990-01-12', '1990-01-15',
               ...
               '2017-10-02', '2017-10-03', '2017-10-04', '2017-10-05',
               '2017-10-06', '2017-10-09', '2017-10-10', '2017-10-11',
               '2017-10-12', '2017-10-13'],
              dtype='datetime64[ns, UTC]', length=7003, freq='C')
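In the debug output, idx equals the length of schedule.index (both are 7003), and dt is larger than the last entry of market_closes_nanos. A sorted search for a value greater than every element returns the array length, which is one past the last valid index. Below is a minimal sketch of that behaviour. It uses the standard-library bisect module as a stand-in for searchsorted, and a two-element list holding only the first and last closes printed above; both are simplifications for illustration, not zipline's actual code.

```python
import bisect

# First and last market closes from the debug output (nanoseconds since epoch).
# The real array has 7003 entries; two are enough to show the effect.
market_closes_nanos = [631314000000000000, 1507924800000000000]

# The dt value printed in the debug output.
dt = 4042592896000000000

# A left-side sorted search, analogous to searchsorted's default behaviour.
idx = bisect.bisect_left(market_closes_nanos, dt)

# dt is past every close, so idx equals the list length and indexing fails.
try:
    market_closes_nanos[idx]
    error = None
except IndexError as exc:
    error = exc

print(idx, len(market_closes_nanos), type(error).__name__)
```

With the full 7003-entry calendar the same search would land on index 7003, which matches the index reported in the debug output.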

Traceback (most recent call last):
  File "~/.virtualenvs/zipline/bin/zipline", line 11, in <module>
    load_entry_point('zipline==1.0.2', 'console_scripts', 'zipline')()
  File "~/.virtualenvs/zipline/lib/python2.7/site-packages/click-6.6-py2.7.egg/click/core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "~/.virtualenvs/zipline/lib/python2.7/site-packages/click-6.6-py2.7.egg/click/core.py", line 696, in main
    rv = self.invoke(ctx)
  File "~/.virtualenvs/zipline/lib/python2.7/site-packages/click-6.6-py2.7.egg/click/core.py", line 1060, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "~/.virtualenvs/zipline/lib/python2.7/site-packages/click-6.6-py2.7.egg/click/core.py", line 889, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "~/.virtualenvs/zipline/lib/python2.7/site-packages/click-6.6-py2.7.egg/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "~/.virtualenvs/zipline/lib/python2.7/site-packages/zipline-1.0.2+162.ga6bfb19.dirty-py2.7-macosx-10.11-intel.egg/zipline/__main__.py", line 306, in ingest
    show_progress,
  File "~/.virtualenvs/zipline/lib/python2.7/site-packages/zipline-1.0.2+162.ga6bfb19.dirty-py2.7-macosx-10.11-intel.egg/zipline/data/bundles/core.py", line 451, in ingest
    pth.data_path([name, timestr], environ=environ),
  File "~/.virtualenvs/zipline/lib/python2.7/site-packages/zipline-1.0.2+162.ga6bfb19.dirty-py2.7-macosx-10.11-intel.egg/zipline/data/bundles/csvdir.py", line 94, in ingest
    writer.write(_pricing_iter(), show_progress=show_progress)
  File "~/.virtualenvs/zipline/lib/python2.7/site-packages/zipline-1.0.2+162.ga6bfb19.dirty-py2.7-macosx-10.11-intel.egg/zipline/data/us_equity_pricing.py", line 265, in write
    return self._write_internal(it, assets)
  File "~/.virtualenvs/zipline/lib/python2.7/site-packages/zipline-1.0.2+162.ga6bfb19.dirty-py2.7-macosx-10.11-intel.egg/zipline/data/us_equity_pricing.py", line 364, in _write_internal
    Timestamp(asset_first_day, unit='s', tz='UTC')
  File "~/.virtualenvs/zipline/lib/python2.7/site-packages/zipline-1.0.2+162.ga6bfb19.dirty-py2.7-macosx-10.11-intel.egg/zipline/utils/calendars/trading_calendar.py", line 699, in minute_to_session_label
    Given a minute, get the label of its containing session.
  File "~/.virtualenvs/zipline/lib/python2.7/site-packages/zipline-1.0.2+162.ga6bfb19.dirty-py2.7-macosx-10.11-intel.egg/zipline/utils/calendars/trading_calendar.py", line 733, in minute_to_session_label
    current_or_next_session = self.schedule.index[idx]
  File "~/.virtualenvs/zipline/lib/python2.7/site-packages/pandas-0.18.1-py2.7-macosx-10.11-intel.egg/pandas/tseries/base.py", line 192, in __getitem__
    val = getitem(key)
IndexError: index 7003 is out of bounds for axis 0 with size 7003

I've added a bit of debugging output to the code:

--- a/zipline/data/us_equity_pricing.py
+++ b/zipline/data/us_equity_pricing.py
@@ -325,6 +325,7 @@ class BcolzDailyBarWriter(object):
         yield asset_id, table

     for asset_id, table in iterator:

diff --git a/zipline/utils/calendars/trading_calendar.py b/zipline/utils/calendars/trading_calendar.py
index 0197ece..f526bdf 100644
--- a/zipline/utils/calendars/trading_calendar.py
+++ b/zipline/utils/calendars/trading_calendar.py
@@ -725,6 +725,11 @@ class TradingCalendar(with_metaclass(ABCMeta)):
         pass

     idx = searchsorted(self.market_closes_nanos, dt)

If you need more info please let me know.

Regards, Ed

bartosh commented 7 years ago

It looks like zipline can't load anything dated before the start of 1970, i.e. any date that produces a negative Unix timestamp, I guess.
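The numbers in the debug output are consistent with that guess. 1962-01-02 is -252374400 seconds relative to the Unix epoch, and reinterpreting that value as an unsigned 32-bit integer gives exactly 4042592896, the asset_first_day printed above. The sketch below reproduces the arithmetic with the standard library only. That the day column is stored as an unsigned 32-bit integer somewhere in the write path is an assumption inferred from these numbers, and to_uint32 is a hypothetical helper written for this illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_uint32(seconds):
    """Hypothetical helper: reinterpret signed epoch seconds as unsigned 32-bit."""
    return seconds % 2**32


# First row of the CSV data from the report.
first_day = datetime(1962, 1, 2, tzinfo=timezone.utc)

# Signed seconds since the epoch: negative for any date before 1970.
signed_seconds = int((first_day - EPOCH).total_seconds())

# Value after wrapping into the unsigned 32-bit range (assumed storage type).
wrapped_seconds = to_uint32(signed_seconds)

# Reading the wrapped value back as a timestamp lands far in the future,
# beyond the last session of a calendar that ends in 2017.
wrapped_date = EPOCH + timedelta(seconds=wrapped_seconds)

print(signed_seconds, wrapped_seconds, wrapped_date.isoformat())
```

If this reading is right, the wrapped date falls after every session in the calendar, which would explain why the session lookup runs off the end of schedule.index.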