Closed chaltik closed 4 years ago
Hi,
This is actually the same issue as in #608
So in your example, if you name the columns in the pandas dataframe, things should go smoothly:
test_df = pd.DataFrame(np.random.randn(10000000,10), columns=[f'col{i}' for i in range(10)])
df = vaex.from_pandas(test_df)
df
Thanks. This came up while trying to read a 2D numpy array into vaex and not finding a direct way to do it :) (from_arrays only takes 1D arrays)
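One way to get a 2D numpy array into vaex without going through pandas is to split it into named 1D columns and pass those to from_arrays. The following is only a sketch: the helper name columns_from_2d and the "col" prefix are made up for illustration, and each column is copied so that it is contiguous in memory.

```python
import numpy as np


def columns_from_2d(array, prefix="col"):
    """Split a 2D array into a dict of named, contiguous 1D columns.

    Hypothetical helper, not part of vaex.
    """
    array = np.asarray(array)
    if array.ndim != 2:
        raise ValueError("expected a 2D array")
    # array[:, i] is a strided view; ascontiguousarray makes a 1D copy.
    return {
        f"{prefix}{i}": np.ascontiguousarray(array[:, i])
        for i in range(array.shape[1])
    }


data = np.random.randn(1000, 10)
columns = columns_from_2d(data)

try:
    import vaex

    # from_arrays takes one keyword argument per 1D column.
    df = vaex.from_arrays(**columns)
except ImportError:
    df = None  # vaex is not installed; the columns dict is still usable
```

The copy per column costs memory for very large arrays, so this trades some efficiency for simplicity.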
Hi,
I think this is related to the version of pandas being used. I'm having the same issue in a project where I'm forced to use pandas version 0.23.4, and the solution you propose doesn't work either; it fails with the same error. I ran exactly the code you use as an example and I still get the error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-55-cf6ccc83be51> in <module>
1 test_df = pd.DataFrame(np.random.randn(10000000,10), columns=[f'col{i}' for i in range(10)])
----> 2 df = vaex.from_pandas(test_df)
3 df
/data/dataiku/dss_data/code-envs/python/py36_weather/lib/python3.6/site-packages/vaex/__init__.py in from_pandas(df, name, copy_index, index_name)
400 print("Giving up column %s, error: %r" % (name, e))
401 for name in df.columns:
--> 402 add(name, df[name])
403 if copy_index:
404 add(index_name, df.index)
/data/dataiku/dss_data/code-envs/python/py36_weather/lib/python3.6/site-packages/vaex/__init__.py in add(name, column)
388 def add(name, column):
389 values = column.values
--> 390 if isinstance(values, pd.core.arrays.integer.IntegerArray):
391 values = np.ma.array(values._data, mask=values._mask)
392 try:
AttributeError: module 'pandas.core.arrays' has no attribute 'integer'
And in fact, in pandas 0.23.4 the module 'pandas.core.arrays' does not yet have the attribute 'integer'.
Yeah, I think we actually require pandas 0.24. We should change that in our requirements or fix this.
I think a workaround for now would be to copy/paste https://github.com/vaexio/vaex/blob/d7c32e046dd3da6eaf773221bd74bdeed2127ab2/packages/vaex-core/vaex/__init__.py#L371
and take out https://github.com/vaexio/vaex/blob/d7c32e046dd3da6eaf773221bd74bdeed2127ab2/packages/vaex-core/vaex/__init__.py#L390
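A rough sketch of that workaround, for anyone stuck on pandas 0.23.x: instead of copying the whole function, the version below collects the column values itself and only applies the IntegerArray conversion when pandas actually provides that class. The helper name pandas_to_arrays is hypothetical, and it assumes the column names are already strings (as in the named-columns example earlier in the thread), since they are passed to from_arrays as keyword arguments.

```python
import numpy as np
import pandas as pd


def pandas_to_arrays(df):
    """Return {column name: values}, tolerating pandas without IntegerArray.

    Hypothetical helper, not part of vaex.
    """
    # pandas < 0.24 has no pandas.core.arrays.integer, so look it up safely.
    integer_module = getattr(pd.core.arrays, "integer", None)
    integer_array_cls = getattr(integer_module, "IntegerArray", None)

    arrays = {}
    for name in df.columns:
        values = df[name].values
        if integer_array_cls is not None and isinstance(values, integer_array_cls):
            # Same conversion as in the traceback: nullable ints to a masked array.
            values = np.ma.array(values._data, mask=values._mask)
        arrays[name] = values
    return arrays


test_df = pd.DataFrame(np.random.randn(1000, 3), columns=["col0", "col1", "col2"])
arrays = pandas_to_arrays(test_df)

try:
    import vaex

    df = vaex.from_arrays(**arrays)
except ImportError:
    df = None  # vaex is not installed; the arrays dict is still usable
```

This skips the index copying and per-column error handling that from_pandas does, so it is a stopgap rather than a replacement.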
`(py3.6-tsse) wtisim@ip-10-0-0-38:~/wtisim$ ipython
Python 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 17:14:51)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import vaex
In [2]: import pandas as pd
In [3]: import numpy as np
In [4]: test_df = pd.DataFrame(np.random.randn(10000000,10))
In [5]: vaex_df = vaex.from_pandas(test_df)
AttributeError Traceback (most recent call last)