quantopian / alphalens

Performance analysis of predictive (alpha) stock factors
http://quantopian.github.io/alphalens
Apache License 2.0

Error occurred in event_study.ipynb in create_event_study_tear_sheet() #223

Closed KojiIzawa closed 6 years ago

KojiIzawa commented 6 years ago

I'm trying to run the [example](alphalens/examples/event_study.ipynb), but an error occurred after calling alphalens.tears.create_event_study_tear_sheet(factor_data, pricing, avgretplot=(5, 10)). Could somebody tell me how to fix it?

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
pandas/index.pyx in pandas.index.DatetimeEngine.get_loc (pandas/index.c:10990)()

pandas/hashtable.pyx in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:6589)()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/indexes/base.py in get_loc(self, key, method, tolerance)
   1944             try:
-> 1945                 return self._engine.get_loc(key)
   1946             except KeyError:

pandas/index.pyx in pandas.index.DatetimeEngine.get_loc (pandas/index.c:11140)()

pandas/index.pyx in pandas.index.DatetimeEngine.get_loc (pandas/index.c:11046)()

pandas/index.pyx in pandas.index.DatetimeEngine._date_check_type (pandas/index.c:11210)()

KeyError: 'min'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
pandas/index.pyx in pandas.index.DatetimeEngine.get_loc (pandas/index.c:10990)()

pandas/hashtable.pyx in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:6589)()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/tseries/index.py in get_loc(self, key, method, tolerance)
   1430         try:
-> 1431             return Index.get_loc(self, key, method, tolerance)
   1432         except (KeyError, ValueError, TypeError):

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/indexes/base.py in get_loc(self, key, method, tolerance)
   1946             except KeyError:
-> 1947                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   1948 

pandas/index.pyx in pandas.index.DatetimeEngine.get_loc (pandas/index.c:11140)()

pandas/index.pyx in pandas.index.DatetimeEngine.get_loc (pandas/index.c:11046)()

pandas/index.pyx in pandas.index.DatetimeEngine._date_check_type (pandas/index.c:11210)()

KeyError: 'min'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
ValueError: Error parsing datetime string "min" at position 0

The above exception was the direct cause of the following exception:

SystemError                               Traceback (most recent call last)
<ipython-input-12-6b9186ed2815> in <module>()
----> 1 alphalens.tears.create_event_study_tear_sheet(factor_data, pricing, avgretplot=(5, 10))

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/alphalens/plotting.py in call_w_context(*args, **kwargs)
     41             with plotting_context(), axes_style():
     42                 sns.despine(left=True)
---> 43                 return func(*args, **kwargs)
     44         else:
     45             return func(*args, **kwargs)

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/alphalens/tears.py in create_event_study_tear_sheet(factor_data, prices, avgretplot)
    654     long_short = False
    655 
--> 656     plotting.plot_quantile_statistics_table(factor_data)
    657 
    658     gf = GridFigure(rows=1, cols=1)

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/alphalens/plotting.py in plot_quantile_statistics_table(factor_data)
    180 def plot_quantile_statistics_table(factor_data):
    181     quantile_stats = factor_data.groupby('factor_quantile') \
--> 182         .agg(['min', 'max', 'mean', 'std', 'count'])['factor']
    183     quantile_stats['count %'] = quantile_stats['count'] \
    184         / quantile_stats['count'].sum() * 100.

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/core/groupby.py in aggregate(self, arg, *args, **kwargs)
   3595     @Appender(SelectionMixin._agg_doc)
   3596     def aggregate(self, arg, *args, **kwargs):
-> 3597         return super(DataFrameGroupBy, self).aggregate(arg, *args, **kwargs)
   3598 
   3599     agg = aggregate

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/core/groupby.py in aggregate(self, arg, *args, **kwargs)
   3112 
   3113         _level = kwargs.pop('_level', None)
-> 3114         result, how = self._aggregate(arg, _level=_level, *args, **kwargs)
   3115         if how is None:
   3116             return result

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/core/base.py in _aggregate(self, arg, *args, **kwargs)
    562             return result, True
    563         elif hasattr(arg, '__iter__'):
--> 564             return self._aggregate_multiple_funcs(arg, _level=_level), None
    565         else:
    566             result = None

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/core/base.py in _aggregate_multiple_funcs(self, arg, _level)
    607                 try:
    608                     colg = self._gotitem(col, ndim=1, subset=obj[col])
--> 609                     results.append(colg.aggregate(arg))
    610                     keys.append(col)
    611                 except (TypeError, DataError):

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/core/groupby.py in aggregate(self, func_or_funcs, *args, **kwargs)
   2572         if hasattr(func_or_funcs, '__iter__'):
   2573             ret = self._aggregate_multiple_funcs(func_or_funcs,
-> 2574                                                  (_level or 0) + 1)
   2575         else:
   2576             cyfunc = self._is_cython_func(func_or_funcs)

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/core/groupby.py in _aggregate_multiple_funcs(self, arg, _level)
   2630             # reset the cache so that we
   2631             # only include the named selection
-> 2632             if name in self._selected_obj:
   2633                 obj = copy.copy(obj)
   2634                 obj._reset_cache()

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/core/generic.py in __contains__(self, key)
    844     def __contains__(self, key):
    845         """True if the key is in the info axis"""
--> 846         return key in self._info_axis
    847 
    848     @property

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/indexes/multi.py in __contains__(self, key)
    951         # work around some kind of odd cython bug
    952         try:
--> 953             self.get_loc(key)
    954             return True
    955         except LookupError:

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/indexes/multi.py in get_loc(self, key, method)
   1547 
   1548         if not isinstance(key, tuple):
-> 1549             loc = self._get_level_indexer(key, level=0)
   1550             return _maybe_to_slice(loc)
   1551 

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/indexes/multi.py in _get_level_indexer(self, key, level, indexer)
   1808         else:
   1809 
-> 1810             loc = level_index.get_loc(key)
   1811             if level > 0 or self.lexsort_depth == 0:
   1812                 return np.array(labels == loc, dtype=bool)

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/tseries/index.py in get_loc(self, key, method, tolerance)
   1437 
   1438             try:
-> 1439                 stamp = Timestamp(key, tz=self.tz)
   1440                 return Index.get_loc(self, stamp, method, tolerance)
   1441             except (KeyError, ValueError):

pandas/tslib.pyx in pandas.tslib.Timestamp.__new__ (pandas/tslib.c:9203)()

pandas/tslib.pyx in pandas.tslib.convert_to_tsobject (pandas/tslib.c:24653)()

pandas/tslib.pyx in pandas.tslib.convert_str_to_tsobject (pandas/tslib.c:26273)()

pandas/src/datetime.pxd in datetime._string_to_dts (pandas/tslib.c:85631)()

SystemError: <class 'str'> returned a result with an error set

<matplotlib.figure.Figure at 0x7f4f246aa518>

Thanks in advance.

luca-s commented 6 years ago

KojiIzawa commented 6 years ago

@luca-s Thank you very much! My pandas was a somewhat older version, 0.18.1. After updating to the current 0.21.0, this sample worked well.
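
For anyone hitting the same traceback, a quick way to see which pandas is installed is sketched below. It is only an illustration: the helper `version_tuple` and the `WORKING` threshold are names made up here, and 0.21.0 is simply the version reported to work in this thread, not a documented Alphalens requirement.

```python
import re
from importlib import metadata  # standard library, Python 3.8+


def version_tuple(version):
    """Convert a version string such as '0.21.0' to a tuple of ints."""
    parts = []
    for piece in version.split(".")[:3]:
        # Keep only the leading digits, so '0rc1' becomes 0.
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


# Assumption: the version that worked for the reporter in this thread.
WORKING = (0, 21, 0)

try:
    installed = metadata.version("pandas")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("pandas is not installed in this environment")
elif version_tuple(installed) < WORKING:
    print(f"pandas {installed} is older than 0.21.0; upgrading may help")
else:
    print(f"pandas {installed} is at least 0.21.0")
```

Running this in the same environment as the notebook shows whether the kernel picks up the pandas version you expect.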

luca-s commented 6 years ago

I am glad it worked with the new pandas, but Alphalens is supposed to work even with pandas 0.18.1, so that's very strange. I am closing this for now, but I will test on my side to see if I can reproduce the bug with pandas 0.18.1, and if so I'll reopen the issue.
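
A minimal sketch that might help with reproducing this outside the notebook: it runs the same aggregation that the traceback points to in plot_quantile_statistics_table, but on a small synthetic frame. The data, the asset names, and the "1D" column are all made up for illustration. The (date, asset) MultiIndex layout is an assumption based on the DatetimeIndex lookups in the traceback, and it is not confirmed that this small frame triggers the error on pandas 0.18.1.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for factor_data: a (date, asset) MultiIndex is assumed.
dates = pd.date_range("2017-01-02", periods=4, freq="D")
assets = ["A", "B", "C"]
index = pd.MultiIndex.from_product([dates, assets], names=["date", "asset"])

factor_data = pd.DataFrame(
    {
        "1D": np.zeros(len(index)),  # placeholder forward returns
        "factor": np.arange(len(index), dtype=float),
        "factor_quantile": [1, 2, 3] * len(dates),
    },
    index=index,
)

# Same aggregation as the lines shown in the traceback (plotting.py 181-184).
quantile_stats = factor_data.groupby("factor_quantile") \
    .agg(["min", "max", "mean", "std", "count"])["factor"]
quantile_stats["count %"] = quantile_stats["count"] \
    / quantile_stats["count"].sum() * 100.

print(quantile_stats)
```

If this small case runs cleanly on an affected installation, the problem probably depends on something specific to the notebook's data rather than on the aggregation alone.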