vaexio / vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
https://vaex.io
MIT License

[BUG-REPORT] Slicing an empty DF raises ZeroDivisionError #2311

Open NickCrews opened 1 year ago

NickCrews commented 1 year ago
import vaex

df = vaex.from_arrays(x=[])
df[:5]

results in

---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[83], line 2
      1 df = vaex.from_arrays(x=[])
----> 2 df[:5]

File ~/Library/Application Support/hatch/env/virtual/noatak-UM6-FHel/noatak/lib/python3.10/site-packages/vaex/dataframe.py:5428, in DataFrame.__getitem__(self, item)
   5426 if start >= stop:  # empty slice
   5427     df = self.trim()
-> 5428     df.set_active_range(start, max(start, stop))
   5429     return df.trim()
   5430 assert step in [None, 1]

File ~/Library/Application Support/hatch/env/virtual/noatak-UM6-FHel/noatak/lib/python3.10/site-packages/vaex/dataframe.py:4375, in DataFrame.set_active_range(self, i1, i2)
   4370 """Sets the active_fraction, set picked row to None, and remove selection.
   4371 
   4372 TODO: we may be able to keep the selection, if we keep the expression, and also the picked row
   4373 """
   4374 # logger.debug("set active range to: %r", (i1, i2))
-> 4375 self._active_fraction = (i2 - i1) / float(self.length_original())
   4376 # self._fraction_length = int(self._length * self._active_fraction)
   4377 self._index_start = i1

ZeroDivisionError: float division by zero

I would expect it to return the same empty DataFrame instead of raising.
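
For illustration, the division in the traceback fails because `length_original()` is 0 for an empty DataFrame. Below is a minimal sketch of two possible guards, written in plain Python so it runs without vaex. Both function names are hypothetical and not part of the vaex API. Returning `0.0` for the empty case is an assumption, and the maintainers may prefer a different value. The caller-side helper relies only on `len()` and indexing, and it has not been verified against a real vaex DataFrame.

```python
def safe_active_fraction(i1, i2, length_original):
    """Hypothetical guard around the division shown in the traceback."""
    if length_original == 0:
        # No rows exist, so there is nothing to take a fraction of.
        # Returning 0.0 here is an assumption, not vaex's documented behavior.
        return 0.0
    return (i2 - i1) / float(length_original)


def slice_or_same(df, item):
    """Hypothetical caller-side workaround: skip slicing when df has no rows."""
    if len(df) == 0:
        # Return the object unchanged instead of slicing it.
        return df
    return df[item]
```

With a guard like the first one in place, the empty-slice branch would no longer divide by zero. The second helper is a stopgap for user code until the library itself handles the case.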