dask-contrib / dask-awkward

Native Dask collection for awkward arrays, and the library to use it.
https://dask-awkward.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Emulate in-place operators like `+=` by replacing `_meta` #500

Open jpivarski opened 3 months ago

jpivarski commented 3 months ago

Like scikit-hep/awkward#3084, there's been a request to allow `dask_array += something` by replacing the content of `dask_array`. For example,

>>> import dask_awkward as dak
>>> import awkward as ak
>>> one = dak.from_awkward(ak.Array(range(20)), npartitions=4)
>>> two = dak.from_awkward(ak.Array(range(100, 120)), npartitions=4)
>>> one += two

raises

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jpivarski/irishep/awkward/src/awkward/_operators.py", line 75, in func
    return ufunc(self, other, out=(self,))
  File "/home/jpivarski/irishep/dask-awkward/src/dask_awkward/lib/core.py", line 1625, in __array_ufunc__
    return _map_partitions(
  File "/home/jpivarski/irishep/dask-awkward/src/dask_awkward/lib/core.py", line 1944, in _map_partitions
    meta = map_meta(fn, *args, **kwargs)
TypeError: map_meta() got an unexpected keyword argument 'out'

which is confusing to users. If `dask_awkward.Array` had

    def __iadd__(self, other):
        self._meta = (self + other)._meta
        return self

and similar, then `one += two` would change `one` in place. I'm told that dask.array and dask-histogram already do this.
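For readers less familiar with how Python dispatches `+=`, here is a minimal plain-Python sketch of the distinction at stake. The classes `Rebinding` and `InPlace` are hypothetical toys, not part of dask-awkward or Awkward. Without `__iadd__`, Python falls back to `x = x + y` and rebinds the name to a new object; with `__iadd__`, the existing object is updated and every other reference to it sees the change.

```python
class Rebinding:
    """Toy class with only __add__: `x += y` falls back to `x = x + y`."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Always builds a new object; neither operand is modified.
        return Rebinding(self.value + other.value)


class InPlace(Rebinding):
    """Toy class with __iadd__: `x += y` updates the existing object."""

    def __iadd__(self, other):
        # Same pattern as the proposal above: compute out of place,
        # then copy the result's state into self and return self.
        self.value = (self + other).value
        return self


# Without __iadd__, the name is rebound and the alias keeps the old object.
a = Rebinding(1)
alias_a = a
a += Rebinding(10)
print(a is alias_a, a.value, alias_a.value)

# With __iadd__, the alias and the name still refer to the same updated object.
b = InPlace(1)
alias_b = b
b += InPlace(10)
print(b is alias_b, b.value, alias_b.value)
```

Which of the two behaviours users expect from `one += two` on a lazy collection is the design question here.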

The implementation I've suggested above, in which dask-awkward replaces its `_meta` in `__iadd__`, is a little different from what I'm planning for Awkward (scikit-hep/awkward#3084), which would replace its `layout`. Maybe there are bad consequences, like losing typetracer reports. Can a `_meta` change in place? Maybe it would be better to do this:

    def __iadd__(self, other):
        self._meta += other._meta
        # also propagate other's typetracer reports into self?
        return self

after scikit-hep/awkward#3084 has been implemented.

martindurant commented 3 months ago

You would need to change the meta, the graph and the node key (.name), I think.

lgray commented 3 months ago

Yes, Martin is correct.
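To illustrate Martin's point, here is a minimal sketch using a hypothetical toy class, `LazyValue`. It is not dask-awkward's `Array`, and its attribute names (`graph`, `name`, `meta`) and the helper `_evaluate` are assumptions made for this example only. It shows why an emulated in-place operator on a lazy collection has to replace the graph and the node key along with the metadata: if only the metadata is swapped, the object still points at its old node and computes the old value.

```python
import operator
from itertools import count

_counter = count()  # used only to generate unique node keys in this toy


class LazyValue:
    """Hypothetical stand-in for a lazy collection (not dask-awkward's Array)."""

    def __init__(self, graph, name, meta):
        self.graph = graph  # key -> literal value, or (callable, *dependency keys)
        self.name = name    # key of the node that produces this value
        self.meta = meta    # cheap description of the result (here: its Python type)

    @classmethod
    def from_value(cls, value):
        name = f"literal-{next(_counter)}"
        return cls({name: value}, name, type(value))

    def __add__(self, other):
        name = f"add-{next(_counter)}"
        graph = {**self.graph, **other.graph,
                 name: (operator.add, self.name, other.name)}
        # Infer the result type from default instances, without real data.
        meta = type(self.meta() + other.meta())
        return LazyValue(graph, name, meta)

    def __iadd__(self, other):
        # Build the out-of-place result, then adopt all three pieces of state.
        result = self + other
        self.graph, self.name, self.meta = result.graph, result.name, result.meta
        return self

    def compute(self):
        return _evaluate(self.graph, self.name)


def _evaluate(graph, key):
    """Recursively evaluate one node of the toy graph."""
    task = graph[key]
    if isinstance(task, tuple) and callable(task[0]):
        fn, *deps = task
        return fn(*(_evaluate(graph, dep) for dep in deps))
    return task


two = LazyValue.from_value(2.5)

# Replacing meta, graph and name together keeps the object consistent.
one = LazyValue.from_value(1)
alias, old_name = one, one.name
one += two
print(one is alias, one.name != old_name, one.meta, one.compute())

# Replacing only the meta leaves it describing a result that is never computed.
stale = LazyValue.from_value(1)
stale.meta = (stale + two).meta
print(stale.meta, stale.compute())
```

In the second case the metadata says `float` while `compute()` still returns the original integer, which is the inconsistency a `_meta`-only `__iadd__` would risk.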