pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

pandas.stats.interface.ols broken in 0.4 from 0.3 #102

Closed jdmarino closed 13 years ago

jdmarino commented 13 years ago

A previously working program (Python 2.6, pandas 0.3) that uses pandas.stats.interface.ols now throws an exception under Python 2.7 and pandas 0.4.

code:

    WP = pandas.WidePanel.fromDict( AllPairSignals)
    AllPairRets = WP.minor_xs('PairRet')
    AllPairSigs = WP.minor_xs('Sig1')
    AllPairModel = pandas.stats.interface.ols( y=AllPairRets, x={'x':AllPairSigs})

very lengthy exception:

ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (8, 0))

ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (70, 0))

c:\python27\lib\site-packages\pandas\core\index.py(242)get_loc()
    241     def get_loc(self, key):
--> 242         return self.indexMap[key]
    243

ipdb> quit

TypeError                                 Traceback (most recent call last)
c:\AlgoTrading\pairstrading\SignalRets.py in <module>()
    197
    198 if __name__ == "__main__":
--> 199     main()

c:\AlgoTrading\pairstrading\SignalRets.py in main()
    177     AllPairRets = WP.minor_xs('PairRet')
    178     AllPairSigs = WP.minor_xs('Sig1')
--> 179     AllPairModel = pandas.stats.interface.ols( y=AllPairRets, x={'x':AllPairSigs})
    180     PanelOLS[RegKey] = AllPairModel
    181     print "PANEL REGRESSION", RegKey, "\n", AllPairModel

C:\Python27\lib\site-packages\pandas\stats\interface.pyc in ols(**kwargs)
    115         klass = MovingPanelOLS
    116
--> 117     return klass(**kwargs)

C:\Python27\lib\site-packages\pandas\stats\plm.pyc in __init__(self, y, x, weights, intercept, nw_lags, entity_effects, time_effects, x_effects, cluster, dropped_dummies, verbose, nw_overlap)
     85         (self._x, self._x_trans,
     86          self._x_filtered, self._y,
---> 87          self._y_trans) = self._prepare_data()
     88
     89         self._x_trans_raw = self._x_trans.values

C:\Python27\lib\site-packages\pandas\stats\plm.pyc in _prepare_data(self)
    111         """
    112         (x, x_filtered, y, weights,
--> 113          weights_filt, cat_mapping) = self._filter_data()
    114
    115         self.log('Adding dummies to X variables')

C:\Python27\lib\site-packages\pandas\stats\plm.pyc in _filter_data(self)
    183
    184         x = data_long.filter(x_names)
--> 185         y = data_long.ix[:, ['y']]
    186
    187         if self._weights:

C:\Python27\lib\site-packages\pandas\core\indexing.pyc in __getitem__(self, key)
     96             return self._fancy_getitem_axis(key, axis=0)
     97         elif isinstance(key, tuple):
---> 98             return self._getitem_tuple(key)
     99         elif _is_list_like(key):
    100             return self._fancy_getitem(key, axis=0)

C:\Python27\lib\site-packages\pandas\core\indexing.pyc in _getitem_tuple(self, key)
    105         if isinstance(self.frame.index, MultiIndex):
    106             try:
--> 107                 return self.frame.xs(key)
    108             except KeyError:
    109                 # could do something more intelligent here? like raising the

C:\Python27\lib\site-packages\pandas\core\frame.pyc in xs(self, key, copy)
    924
    925         self._consolidate_inplace()
--> 926         new_data = self._data.xs(key, axis=1, copy=copy)
    927         if new_data.ndim == 1:
    928             return Series(new_data.as_matrix(), index=self.columns)

C:\Python27\lib\site-packages\pandas\core\internals.pyc in xs(self, key, axis, copy)
    430         assert(axis >= 1)
    431
--> 432         loc = self.axes[axis].get_loc(key)
    433         slicer = [slice(None, None) for _ in range(self.ndim)]
    434         slicer[axis] = loc

C:\Python27\lib\site-packages\pandas\core\index.pyc in get_loc(self, key)
    624         if isinstance(key, tuple):
    625             if len(key) == self.nlevels:
--> 626                 return self._get_tuple_loc(key)
    627             else:
    628                 result = slice(*self.slice_locs(key, key))

C:\Python27\lib\site-packages\pandas\core\index.pyc in _get_tuple_loc(self, tup)
    642
    643     def _get_tuple_loc(self, tup):
--> 644         indexer = self._get_label_key(tup)
    645         try:
    646             return self.indexMap[indexer]

C:\Python27\lib\site-packages\pandas\core\index.pyc in _get_label_key(self, tup)
    649
    650     def _get_label_key(self, tup):
--> 651         return tuple(lev.get_loc(v) for lev, v in zip(self.levels, tup))
    652
    653     def truncate(self, before=None, after=None):

C:\Python27\lib\site-packages\pandas\core\index.pyc in <genexpr>((lev, v))
    649
    650     def _get_label_key(self, tup):
--> 651         return tuple(lev.get_loc(v) for lev, v in zip(self.levels, tup))
    652
    653     def truncate(self, before=None, after=None):

C:\Python27\lib\site-packages\pandas\core\index.pyc in get_loc(self, key)
    240
    241     def get_loc(self, key):
--> 242         return self.indexMap[key]
    243
    244     def get_indexer(self, target, method=None):

TypeError: unhashable type
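
One possible reading of the traceback above, offered as an assumption rather than a confirmed diagnosis: the key built by `data_long.ix[:, ['y']]` contains a slice and a list, both unhashable in Python 2.7, and it ends up in the dict-style lookup `self.indexMap[key]`. The `try` block in `_getitem_tuple` only catches `KeyError`, so a `TypeError` from hashing would propagate to the caller. The sketch below is not pandas code; `TinyIndex`, `index_map`, and `lookup` are hypothetical names used only to show how a dict-backed lookup separates the two exceptions.

```python
# Minimal sketch (not pandas source): a dict-backed label lookup in the
# spirit of the `indexMap[key]` line shown in the traceback.

class TinyIndex(object):
    """Hypothetical stand-in for an index that maps labels to positions."""

    def __init__(self, labels):
        self.index_map = dict((label, pos) for pos, label in enumerate(labels))

    def get_loc(self, key):
        # A dict lookup hashes the key first, so an unhashable key
        # raises TypeError before any KeyError can occur.
        return self.index_map[key]


def lookup(index, key):
    """Describe the outcome of looking up `key` in `index`."""
    try:
        return "found at %d" % index.get_loc(key)
    except KeyError:
        # Hashable key that is simply not present.
        return "KeyError"
    except TypeError:
        # Unhashable key, e.g. a list such as ['y'].
        return "TypeError"


idx = TinyIndex(["x", "y"])
results = [lookup(idx, "y"), lookup(idx, "z"), lookup(idx, ["y"])]
print(results)
```

A handler written as `except KeyError:` alone would let the third case escape, which is consistent with the `TypeError: unhashable type` at the bottom of the traceback. The example uses a list key because lists are unhashable in every Python version.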

Here's a dump of what is in the 2 DataFrames. (The column name is the str of a tuple.)

ipdb> AllPairSigs
            ('CSX', 'NSC')
2011-04-01  -0.6202
2011-04-04  1.433
2011-04-05  -0.8794
2011-04-06  0.9277
2011-04-07  -0.2481
2011-04-08  0.3348
2011-04-11  0.9487
2011-04-12  1.805
2011-04-13  2.779
2011-04-14  -4.287
2011-04-15  0.5962
2011-04-18  1.056
2011-04-19  0.166
2011-04-20  6.21
2011-04-21  0.2945
2011-04-22  -1.267
2011-04-25  0.9863
2011-04-26  1.742
2011-04-27  1.392
2011-04-28  7.962
2011-04-29  3.862
2011-05-02  -1.323
2011-05-03  0.6395
2011-05-04  -1.665
2011-05-05  -2.061
2011-05-06  -0.729
2011-05-09  -0.2925
2011-05-10  2.134

ipdb> AllPairRets
            ('CSX', 'NSC')
2011-04-01  0.008226
2011-04-04  0.007131
2011-04-05  -0.005219
2011-04-06  0.006201
2011-04-07  0.008445
2011-04-08  0.009267
2011-04-11  0.005058
2011-04-12  -0.007568
2011-04-13  0.008001
2011-04-14  -0.00607
2011-04-15  -0.009921
2011-04-18  0.006624
2011-04-19  0.006538
2011-04-20  0.01625
2011-04-21  0.01686
2011-04-22  0
2011-04-25  -0.008953
2011-04-26  -0.01038
2011-04-27  0.009355
2011-04-28  -0.02005
2011-04-29  0.007567
2011-05-02  -0.008787
2011-05-03  -0.009324
2011-05-04  0.00906
2011-05-05  0.01203
2011-05-06  -0.005018
2011-05-09  -0.008835
2011-05-10  -0.008339

wesm commented 13 years ago

This got fixed yesterday evening. I think I built you a binary while I was in the middle of making some other changes and hadn't run the unit test suite... sorry =/ I'll reply off-list with a new