scikit-hep / awkward

Manipulate JSON-like data with NumPy-like idioms.
https://awkward-array.org
BSD 3-Clause "New" or "Revised" License

Slicing `dask-awkward` array in too many dimensions raises `StopIteration` #2302

Closed: masonproffitt closed this issue 1 year ago

masonproffitt commented 1 year ago

Description of new feature

Trying to slice an Awkward Array in too many dimensions raises a helpful error:

>>> import awkward as ak, dask_awkward as dak
>>> a = ak.Array([1])
>>> a[:, 0]
Traceback (most recent call last):
  File "/home/user/iris-hep/src/awkward-1.0/src/awkward/contents/numpyarray.py", line 346, in _getitem_next
    out = self._data[where]
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/iris-hep/src/awkward-1.0/src/awkward/highlevel.py", line 951, in __getitem__
    out = self._layout[where]
  File "/home/user/iris-hep/src/awkward-1.0/src/awkward/contents/content.py", line 548, in __getitem__
    return self._getitem(where)
  File "/home/user/iris-hep/src/awkward-1.0/src/awkward/contents/content.py", line 589, in _getitem
    out = next._getitem_next(nextwhere[0], nextwhere[1:], None)
  File "/home/user/iris-hep/src/awkward-1.0/src/awkward/contents/regulararray.py", line 521, in _getitem_next
    nextcontent._getitem_next(nexthead, nexttail, advanced),
  File "/home/user/iris-hep/src/awkward-1.0/src/awkward/contents/numpyarray.py", line 348, in _getitem_next
    raise ak._errors.index_error(self, (head, *tail), str(err)) from err
IndexError: while attempting to slice

    <Array [1] type='1 * int64'>

with

    (:, 0)

at inner NumpyArray of length 1, using sub-slice (0).

Error details: too many indices for array: array is 1-dimensional, but 2 were indexed.

For dask-awkward arrays, however, the same slice raises a much less friendly uncaught StopIteration, coming from typetracer:

>>> da = dak.from_awkward(a, 1)
>>> da[:, 0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/iris-hep/src/dask-awkward/src/dask_awkward/lib/core.py", line 920, in __getitem__
    return self._getitem_tuple(where)
  File "/home/user/iris-hep/src/dask-awkward/src/dask_awkward/lib/core.py", line 847, in _getitem_tuple
    return self._getitem_trivial_map_partitions(where)
  File "/home/user/iris-hep/src/dask-awkward/src/dask_awkward/lib/core.py", line 722, in _getitem_trivial_map_partitions
    meta = self._meta[metad]
  File "/home/user/iris-hep/src/awkward-1.0/src/awkward/highlevel.py", line 951, in __getitem__
    out = self._layout[where]
  File "/home/user/iris-hep/src/awkward-1.0/src/awkward/contents/content.py", line 548, in __getitem__
    return self._getitem(where)
  File "/home/user/iris-hep/src/awkward-1.0/src/awkward/contents/content.py", line 589, in _getitem
    out = next._getitem_next(nextwhere[0], nextwhere[1:], None)
  File "/home/user/iris-hep/src/awkward-1.0/src/awkward/contents/regulararray.py", line 521, in _getitem_next
    nextcontent._getitem_next(nexthead, nexttail, advanced),
  File "/home/user/iris-hep/src/awkward-1.0/src/awkward/contents/numpyarray.py", line 346, in _getitem_next
    out = self._data[where]
  File "/home/user/iris-hep/src/awkward-1.0/src/awkward/_nplikes/typetracer.py", line 456, in __getitem__
    dimension_length = next(iter_shape)
StopIteration

Ideally, this case should produce a message similar to the first example.

jpivarski commented 1 year ago

This issue might need to be moved to https://github.com/dask-contrib/dask-awkward. I think the fix is going to involve changes to that codebase, not this one.

agoose77 commented 1 year ago

Going by the stack trace, we're raising a StopIteration exception in typetracer when the slice has more items than the array has dimensions (the shape iterator runs out). We should be raising an IndexError here, which would then be caught and re-raised with context by https://github.com/scikit-hep/awkward/blob/7fbe6b856cb87cfbe5a4827b7586852c49ff2bbe/src/awkward/contents/numpyarray.py#L346-L348
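
A minimal sketch of the change described above, in plain Python. This is not the awkward source: `traced_shape` and `getitem_next` are hypothetical stand-ins for the typetracer's `__getitem__` and for the `try`/`except IndexError` wrapper in `numpyarray.py`, and the sketch assumes all dimension lengths are known integers. It only illustrates the idea of converting the exhausted shape iterator into an IndexError so that the existing handler can add context.

```python
from typing import Sequence, Tuple, Union

Index = Union[int, slice]


def traced_shape(shape: Sequence[int], where: Tuple[Index, ...]) -> Tuple[int, ...]:
    """Compute the result shape of data-free indexing (hypothetical helper)."""
    iter_shape = iter(shape)
    result = []
    for item in where:
        try:
            dimension_length = next(iter_shape)
        except StopIteration:
            # More index items than dimensions: raise IndexError with a
            # NumPy-style message instead of leaking StopIteration.
            raise IndexError(
                f"too many indices for array: array is {len(shape)}-dimensional, "
                f"but {len(where)} were indexed"
            ) from None
        if isinstance(item, slice):
            # A slice keeps the dimension, possibly with a shorter length.
            result.append(len(range(*item.indices(dimension_length))))
        elif not -dimension_length <= item < dimension_length:
            raise IndexError(
                f"index {item} is out of bounds for axis with size {dimension_length}"
            )
        # An in-bounds integer index simply removes the dimension.
    result.extend(iter_shape)  # un-indexed trailing dimensions pass through
    return tuple(result)


def getitem_next(shape: Sequence[int], where: Tuple[Index, ...]) -> Tuple[int, ...]:
    """Mimic the caller's pattern: catch IndexError and re-raise with context."""
    try:
        return traced_shape(shape, where)
    except IndexError as err:
        raise IndexError(f"while attempting to slice with {where!r}: {err}") from err
```

With this shape, `getitem_next((1,), (slice(None), 0))` raises an IndexError whose message includes the "too many indices" details, because the wrapper's `except IndexError` clause now sees the failure. A bare StopIteration would pass straight through that clause, which is what the dask-awkward traceback shows.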

jpivarski commented 1 year ago

Oh, okay. So it is something to do here.