vaexio / vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
https://vaex.io
MIT License

[BUG-REPORT] Can't do binary ops between expressions from copied DFs #2324

Open NickCrews opened 1 year ago

NickCrews commented 1 year ago

Description

import vaex

df = vaex.from_arrays(x=[1, 2, 3])
df2 = df.copy()
df.x + df2.x

raises:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[5], line 5
      3 df = vaex.from_arrays(x=[1, 2, 3])
      4 df2 = df.copy()
----> 5 df.x + df2.x

File ~/Library/Application Support/hatch/env/virtual/noatak-UM6-FHel/noatak/lib/python3.10/site-packages/vaex/expression.py:144, in Meta.__new__.<locals>.wrap.<locals>.f(a, b)
    142 else:
    143     if isinstance(b, Expression):
--> 144         assert b.ds == a.ds
    145         b = b.expression
    146     elif isinstance(b, (np.timedelta64)):

AssertionError: 

I'm not sure whether this is a bug, but it seems to me that it should be. Both df and df2 come from the same source data, and that should be what matters for making this operation possible. Instead, vaex complains that df and df2 aren't the exact same DataFrame. Am I missing something?
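To make the failure mode concrete, here is a minimal toy model of the check visible in the traceback (`assert b.ds == a.ds`, then `b = b.expression`). It does not use vaex: `ToyFrame`, `ToyExpression`, and `col` are hypothetical names invented for illustration, and the sketch assumes the frame comparison falls back to object identity, which would explain why a copy never compares equal to its original.

```python
class ToyFrame:
    """Hypothetical stand-in for a DataFrame.

    It defines no __eq__, so `==` falls back to object identity
    (an assumption about how the real comparison behaves).
    """

    def __init__(self, columns):
        self.columns = dict(columns)

    def copy(self):
        # Same data, but a distinct object.
        return ToyFrame(self.columns)

    def col(self, name):
        # Build an expression bound to this frame.
        return ToyExpression(self, name)


class ToyExpression:
    """Hypothetical expression: a string plus the frame it is bound to."""

    def __init__(self, ds, expression):
        self.ds = ds
        self.expression = expression

    def __add__(self, other):
        if isinstance(other, ToyExpression):
            # Mirrors the check shown in the traceback.
            assert other.ds == self.ds
            other = other.expression
        return ToyExpression(self.ds, f"({self.expression} + {other})")


df = ToyFrame({"x": [1, 2, 3]})
df2 = df.copy()

# Both operands are bound to the same frame: the check passes.
same = df.col("x") + df.col("x")

# Operands bound to the original and to its copy: the check fails.
try:
    df.col("x") + df2.col("x")
    cross_frame_failed = False
except AssertionError:
    cross_frame_failed = True

# Rebinding the copy's expression string to the original frame avoids the check.
rebound = df.col("x") + df.col(df2.col("x").expression)
```

In this toy model the data is identical in both frames, yet only the identity of the owning frame is checked. Whether the analogous rebinding (taking the expression string from one frame and evaluating it against the other) is a safe workaround in vaex itself is not verified here.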

Software information