scikit-hep / hist

Histogramming for analysis powered by boost-histogram
https://hist.readthedocs.io
BSD 3-Clause "New" or "Revised" License

[FEATURE] Float weights for dask histograms #493

Closed by alexander-held 1 year ago

alexander-held commented 1 year ago

Passing a single float as the weight is convenient, but it is not supported when filling Dask histograms.

import awkward as ak
import dask_awkward as dak
import hist.dask

x = ak.Array([0, 1, 2])
dx = dak.from_awkward(x, npartitions=1)

hist.dask.Hist.new.Reg(1, 0, 1).Weight().fill(dx, weight=0.5)
# AttributeError: 'float' object has no attribute 'ndim'

hist.Hist.new.Reg(1, 0, 1).Weight().fill(x, weight=0.5)
# this is fine

Describe the feature you'd like

Support for the above, so that a float can be passed as the weight.

Describe alternatives, if any, you've considered

The following works:

hist.dask.Hist.new.Reg(1, 0, 1).Weight().fill(dx, weight=dak.ones_like(dx)*0.5)

cc @lgray (thanks for the workaround trick!)

lgray commented 1 year ago

This should be moved to dask-histogram. The issue is there, not in hist.

alexander-held commented 1 year ago

I don't know whether issues can be moved from scikit-hep to dask-histogram, but I can open a new one there if needed. If this is out of scope for dask-histogram, it might still fit into hist as a higher-level API that adds convenience features.

lgray commented 1 year ago

dask-histogram already supports floats/ints/str for other inputs, so weight and sample should be no different! It should broadcast!
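
To make the expected broadcasting behaviour concrete, here is a minimal pure-Python sketch. It is not how dask-histogram or hist implement filling; `broadcast_weight` and `fill_single_bin` are hypothetical helpers written only for illustration. It assumes a single half-open bin and tracks the sum of weights and the sum of squared weights, which is what a weighted storage accumulates.

```python
from numbers import Real


def broadcast_weight(values, weight):
    """Return one weight per entry of `values`.

    Hypothetical helper for illustration; not a dask-histogram or hist API.
    """
    if weight is None:
        # No weight given: every entry counts once.
        return [1.0] * len(values)
    if isinstance(weight, Real):
        # A scalar weight applies equally to every entry.
        return [float(weight)] * len(values)
    weight = list(weight)
    if len(weight) != len(values):
        raise ValueError("weight must be a scalar or match the length of values")
    return [float(w) for w in weight]


def fill_single_bin(values, weight=None, start=0.0, stop=1.0):
    """Sum weights and squared weights for entries in the bin [start, stop)."""
    weights = broadcast_weight(values, weight)
    total = 0.0
    variance = 0.0
    for v, w in zip(values, weights):
        if start <= v < stop:
            total += w
            variance += w * w
    return total, variance


x = [0, 1, 2]
scalar_result = fill_single_bin(x, weight=0.5)
array_result = fill_single_bin(x, weight=[0.5, 0.5, 0.5])
```

With this sketch, a scalar weight and an explicit per-entry array of the same value give the same bin contents, which is the equivalence the `dak.ones_like(dx)*0.5` workaround relies on.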

alexander-held commented 1 year ago

I opened https://github.com/dask-contrib/dask-histogram/issues/68, so I think we can close this one.