wesm / pydata-book

Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media

DataFrame constructor did not accept a list for the index keyword argument #88

Closed madenu closed 3 years ago

madenu commented 6 years ago

This is in ch05.ipynb.

I changed pd.DataFrame(pop, index=[2001, 2002, 2003]) to pd.DataFrame(pop, index=pd.Series([2001, 2002, 2003])) to get this cell to run.
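For reference, a minimal sketch of that workaround as a standalone cell. It assumes pop is the nested dict quoted further down in this thread; the variable name frame is only illustrative.

```python
import pandas as pd

# Assumed data: the nested dict quoted later in this thread.
pop = {"Nevada": {2001: 2.4, 2002: 2.9},
       "Ohio": {2000: 1.5, 2001: 1.7, 2002: 3.6}}

# Workaround: wrap the year labels in a Series instead of passing a bare list.
frame = pd.DataFrame(pop, index=pd.Series([2001, 2002, 2003]))
print(frame)
```

Rows are aligned on the requested years, so 2000 is dropped and 2003 is filled with NaN.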

frodo-x commented 6 years ago

I have the same problem. Running on Windows 7, Python 3.7.0, pandas 0.23.4.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-70-ea6d09f8324b> in <module>()
----> 1 DataFrame(pop, index=[2001, 2002, 2003])

D:\Anaconda\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
    346                                  dtype=dtype, copy=copy)
    347         elif isinstance(data, dict):
--> 348             mgr = self._init_dict(data, index, columns, dtype=dtype)
    349         elif isinstance(data, ma.MaskedArray):
    350             import numpy.ma.mrecords as mrecords

D:\Anaconda\lib\site-packages\pandas\core\frame.py in _init_dict(self, data, index, columns, dtype)
    457             arrays = [data[k] for k in keys]
    458 
--> 459         return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    460 
    461     def _init_ndarray(self, values, index, columns, dtype=None, copy=False):

D:\Anaconda\lib\site-packages\pandas\core\frame.py in _arrays_to_mgr(arrays, arr_names, index, columns, dtype)
   7357 
   7358     # don't force copy because getting jammed in an ndarray anyway
-> 7359     arrays = _homogenize(arrays, index, dtype)
   7360 
   7361     # from BlockManager perspective

D:\Anaconda\lib\site-packages\pandas\core\frame.py in _homogenize(data, index, dtype)
   7659             if isinstance(v, dict):
   7660                 if oindex is None:
-> 7661                     oindex = index.astype('O')
   7662 
   7663                 if isinstance(index, (DatetimeIndex, TimedeltaIndex)):

AttributeError: 'list' object has no attribute 'astype'

wesm commented 6 years ago

This bug has been fixed in pandas development and will be in the next released version (either 0.23.5 or 0.24.0). Sorry about the inconvenience. For now, you can convert the index argument to an ndarray: index=np.array(...)

yau703 commented 5 years ago

I got this error when I converted the index argument to an ndarray: AttributeError: 'numpy.ndarray' object has no attribute 'values'

wesm commented 5 years ago

@Lamsaan can you provide more details? There might be another / different bug

yau703 commented 5 years ago

Here's my code. Is it a bug, or did I make a stupid mistake?

pop = {'Nevada': {2001: 2.4, 2002: 2.9},'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}
pd.DataFrame(pop,index=np.array([2001,2002,2003]))

wesm commented 5 years ago

In [8]: pop = {'Nevada': {2001: 2.4, 2002: 2.9},'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}; pd.DataFrame(pop,index=np.array([2001,2002,2003]))
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-8-363053f66eee> in <module>
----> 1 pop = {'Nevada': {2001: 2.4, 2002: 2.9},'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}; pd.DataFrame(pop,index=np.array([2001,2002,2003]))

~/miniconda/envs/arrow-dev/lib/python3.6/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    346                                  dtype=dtype, copy=copy)
    347         elif isinstance(data, dict):
--> 348             mgr = self._init_dict(data, index, columns, dtype=dtype)
    349         elif isinstance(data, ma.MaskedArray):
    350             import numpy.ma.mrecords as mrecords

~/miniconda/envs/arrow-dev/lib/python3.6/site-packages/pandas/core/frame.py in _init_dict(self, data, index, columns, dtype)
    457             arrays = [data[k] for k in keys]
    458 
--> 459         return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    460 
    461     def _init_ndarray(self, values, index, columns, dtype=None, copy=False):

~/miniconda/envs/arrow-dev/lib/python3.6/site-packages/pandas/core/frame.py in _arrays_to_mgr(arrays, arr_names, index, columns, dtype)
   7357 
   7358     # don't force copy because getting jammed in an ndarray anyway
-> 7359     arrays = _homogenize(arrays, index, dtype)
   7360 
   7361     # from BlockManager perspective

~/miniconda/envs/arrow-dev/lib/python3.6/site-packages/pandas/core/frame.py in _homogenize(data, index, dtype)
   7665                 else:
   7666                     v = dict(v)
-> 7667                 v = lib.fast_multiget(v, oindex.values, default=np.nan)
   7668             v = _sanitize_array(v, index, dtype=dtype, copy=False,
   7669                                 raise_cast_failure=False)

AttributeError: 'numpy.ndarray' object has no attribute 'values'

@jreback @toobaz did this get fixed in 0.24.x? This failure is with 0.23.4
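The two tracebacks in this thread fail on two different attributes of the index argument: astype (missing on a list) and values (missing on an ndarray). The sketch below only checks which of those attributes each container has; it does not reproduce pandas internals. pd.Index is not mentioned in the thread and is included here purely for comparison.

```python
import numpy as np
import pandas as pd

years = [2001, 2002, 2003]

# The same labels wrapped in the containers discussed in this thread.
candidates = {
    "list": years,
    "ndarray": np.array(years),
    "Series": pd.Series(years),
    "Index": pd.Index(years),  # added for comparison, not from the thread
}

# The 0.23.4 tracebacks call index.astype('O') and then oindex.values.
support = {
    name: (hasattr(obj, "astype"), hasattr(obj, "values"))
    for name, obj in candidates.items()
}

for name, (has_astype, has_values) in support.items():
    print(f"{name:8} astype={has_astype} values={has_values}")
```

Only the pandas containers expose both attributes, which is consistent with the Series workaround succeeding where the list and the ndarray fail on 0.23.4.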

xochozomatli commented 5 years ago

If this hasn't been fixed, you can still use the index keyword if you pass it a Series instead of a list or ndarray.

kaushikSR commented 5 years ago

I still have the same issue.

jreback commented 5 years ago

this is fixed in 0.24.2
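For a notebook that has to run on both sides of that release, one option is to branch on the installed version. This is only a sketch: version_tuple is a hypothetical helper written for this example, the 0.24.2 threshold is taken from the comment above, and pop is the dict quoted earlier in the thread.

```python
import re

import pandas as pd


def version_tuple(version):
    """Leading numeric parts of a version string, e.g. '0.24.2' -> (0, 24, 2)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


pop = {"Nevada": {2001: 2.4, 2002: 2.9},
       "Ohio": {2000: 1.5, 2001: 1.7, 2002: 3.6}}
years = [2001, 2002, 2003]

if version_tuple(pd.__version__) >= (0, 24, 2):
    # Per this thread, a plain list is accepted again from 0.24.2.
    frame = pd.DataFrame(pop, index=years)
else:
    # Older releases: fall back to the Series workaround.
    frame = pd.DataFrame(pop, index=pd.Series(years))

print(frame)
```

Upgrading pandas is the simpler fix when that is possible; the branch is only useful when the installed version cannot be changed.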