vaexio / vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
https://vaex.io
MIT License

[BUG-REPORT] vaex.concat or dataframe.concat with one-row dataframe failed #1008

Closed tanruoqiao closed 3 years ago

tanruoqiao commented 4 years ago

Description: vaex.concat([ori_df, add_df]) and ori_df.concat(add_df) both fail when add_df contains only one row. If add_df contains 2 rows, there is no error.
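A minimal sketch of how the report could be reproduced without the original data. The column values are copied from the tables in this issue, but the two-column layout and the `reproduce` helper are illustrative assumptions, not the reporter's actual script. Since the traceback starts in `__repr__` / `_repr_mimebundle_`, the sketch renders the concatenated frame rather than only building it.

```python
try:
    import numpy as np
    import vaex
except ImportError:  # vaex (or numpy) is not installed; the sketch then does nothing
    vaex = None


def reproduce():
    """Concatenate a 3-row and a 1-row frame, then render the result.

    Hypothetical helper: returns None if vaex is unavailable, the row count
    if rendering succeeds, or the error text if the IndexError from the
    report is raised.
    """
    if vaex is None:
        return None
    ori_df = vaex.from_arrays(
        sec_int=np.array([605399, 605399, 605399]),
        close=np.array([18.95, 20.85, 22.94]),
    )
    add_df = vaex.from_arrays(
        sec_int=np.array([605399]),
        close=np.array([19.6]),
    )
    combined = vaex.concat([ori_df, add_df])
    try:
        repr(combined)  # displaying the frame is where the reported traceback begins
    except IndexError as exc:
        return f"IndexError: {exc}"
    return len(combined)


print(reproduce())
```

If the helper returns the row count on a current vaex release but an `IndexError` string on the reporter's version, that would point to a bug already fixed upstream.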

>>> ori_df
#    sec_int    trade_date    open    high    low    close    pre_close    change    pct_chg    vol        amount      market_int
0    605399     20200804      15.79   18.95   15.79  18.95    13.16        5.79      43.997     1325.0     2500.131    1
1    605399     20200805      20.85   20.85   20.85  20.85    18.95        1.9       10.0264    336.29     701.165     1
2    605399     20200806      22.94   22.94   22.94  22.94    20.85        2.09      10.024     720.02     1651.726    1
>>> add_df
#    sec_int    trade_date    open    high    low    close    pre_close    change    pct_chg      vol    amount    market_int
0     605399    2.0201e+07   19.43   19.66  19.29     19.6        19.51      0.09     0.4613  19109.9   37245.5   1
>>> ori_df.concat(add_df)
Traceback (most recent call last):
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\IPython\core\formatters.py", line 224, in catch_format_error
    r = method(self, *args, **kwargs)
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\IPython\core\formatters.py", line 970, in __call__
    return method(include=include, exclude=exclude)
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\vaex\dataframe.py", line 3617, in _repr_mimebundle_
    return {'text/html':self._head_and_tail_table(format='html'), 'text/plain': self._head_and_tail_table(format='plain')}
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\vaex\dataframe.py", line 3412, in _head_and_tail_table
    return self._as_table(0, n, N - n, N, format=format)
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\vaex\dataframe.py", line 3550, in _as_table
    table_part(j1, j2, parts)
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\vaex\dataframe.py", line 3520, in table_part
    df = self[k1:k2]
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\vaex\dataframe.py", line 4556, in __getitem__
    return df.trim()
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\vaex\dataframe.py", line 3796, in trim
    df.columns[name] = column.trim(self._index_start, self._index_end)
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\vaex\column.py", line 195, in trim
    expressions.append(self.expressions[i])
IndexError: list index out of range
Traceback (most recent call last):
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\IPython\core\formatters.py", line 224, in catch_format_error
    r = method(self, *args, **kwargs)
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\IPython\core\formatters.py", line 702, in __call__
    printer.pretty(obj)
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\IPython\lib\pretty.py", line 402, in pretty
    return _repr_pprint(obj, self, cycle)
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\IPython\lib\pretty.py", line 697, in _repr_pprint
    output = repr(obj)
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\vaex\dataframe.py", line 3628, in __repr__
    return self._head_and_tail_table(format='plain')
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\vaex\dataframe.py", line 3412, in _head_and_tail_table
    return self._as_table(0, n, N - n, N, format=format)
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\vaex\dataframe.py", line 3550, in _as_table
    table_part(j1, j2, parts)
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\vaex\dataframe.py", line 3520, in table_part
    df = self[k1:k2]
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\vaex\dataframe.py", line 4556, in __getitem__
    return df.trim()
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\vaex\dataframe.py", line 3796, in trim
    df.columns[name] = column.trim(self._index_start, self._index_end)
  File "C:\Users\tanru\Anaconda3\envs\trading_env\lib\site-packages\vaex\column.py", line 195, in trim
    expressions.append(self.expressions[i])
IndexError: list index out of range


JovanVeljanoski commented 4 years ago

Hi,

Can you please try to update vaex? You seem to be running a very old version.

You can check the release notes on GitHub to see which one is the latest.
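One way to see which version is installed before comparing against the release notes is sketched below. It uses only the standard library; the distribution name "vaex" as the default and the `installed_version` helper are assumptions for illustration, and the name to query may differ depending on how the package was installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "vaex"):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None when the package metadata cannot be found.
print(installed_version())
```

After upgrading through the same channel used for the original install, running the snippet again should confirm that the newer version is the one actually being imported from that environment.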

Cheers

On Thu, Oct 15, 2020, 08:00 roguet notifications@github.com wrote:


Software information

  • Vaex version (import vaex; vaex.__version__): '1.0.0-beta.6'
  • Vaex was installed via: conda-forge
  • OS: Windows 10


JovanVeljanoski commented 3 years ago

Closing this as stale. Please re-open if needed.