henryiii opened this issue 4 years ago
This is a bit tricky to implement; I've started it, but pybind11 doesn't provide runtime utilities for array access, and I don't want to generate 32 copies of this, so it will likely miss the 1.0 target. I think that's fine, since no one has been too worried about missing this so far. Easy buffer access via `.view()` and similar makes it somewhat less important.
Hi @henryiii @HDembinski,
I assume the following is related; if not, please correct me and I will open a fresh issue...
We noticed in the scope of our analysis that `__getitem__` is a performance hurdle for high-dimensional histograms (imagine: a dataset axis with O(1000) datasets, a category axis with O(100) categories, and a systematic axis with O(100) shifts).
Here is a snippet that makes the performance difference visible:

```python
import boost_histogram as bh

h = bh.Histogram(
    bh.axis.StrCategory([str(i) for i in range(100)]),  # e.g. datasets
    bh.axis.StrCategory([str(i) for i in range(100)]),  # e.g. categories
    bh.axis.StrCategory([str(i) for i in range(100)]),  # e.g. systematics
    bh.axis.Regular(100, 0, 500),
)

# let's fill a dummy value
h[...] = 1.0

# now the __getitem__ performance:
%timeit h[bh.loc("42"), bh.loc("42"), bh.loc("42"), :].view()
# 4.08 s ± 61.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit h.view()[h.axes[0].index("42"), h.axes[1].index("42"), h.axes[2].index("42"), :]
# 20.3 µs ± 669 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
Currently we use the second option, since at a larger analysis scale, with several of these huge histograms, the difference is O(hours) versus O(seconds) for histogram manipulation such as grouping datasets into physics processes. However, the first option is (obviously) a lot more convenient to use.
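The fast workaround (resolve each label to an integer bin once, then index the raw view) can be sketched without boost_histogram at all. The following is a minimal NumPy-only illustration; the dict-based `make_category_index` helper is a hypothetical stand-in for `axis.index`, not part of any library:

```python
import numpy as np

def make_category_index(labels):
    """Hypothetical stand-in for a StrCategory axis' label -> bin lookup."""
    return {label: i for i, label in enumerate(labels)}

labels = [str(i) for i in range(10)]
cat = make_category_index(labels)

# Small dummy 4-D counts array standing in for h.view()
counts = np.ones((10, 10, 10, 10))

# Fast path: resolve each label to an integer once, then slice the view
# directly, avoiding any per-call overhead of a generic __getitem__.
sel = counts[cat["4"], cat["4"], cat["4"], :]
print(sel.shape)  # prints (10,)
```

The point is that the label lookup is a cheap O(1) dict access, so all remaining cost is plain NumPy slicing.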
I think this would be a major improvement, especially for the usability of `hist` and `boost_histogram` in large-scale analysis.
Best, Peter
- `_at`
- `_at_set`
- `__getitem__`, `__setitem__` (uses the above functions internally)
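For context, here is a toy sketch of how `__getitem__`/`__setitem__` could delegate to per-cell `_at`/`_at_set` helpers. The names are taken from the list above; the class and its internals are purely illustrative, not boost-histogram's actual implementation:

```python
import numpy as np

class TinyHist:
    """Illustrative toy histogram; only integer indexing, no axes or flow bins."""

    def __init__(self, shape):
        self._view = np.zeros(shape)

    def _at(self, idx):
        # single-cell read on the underlying view
        return self._view[idx]

    def _at_set(self, idx, value):
        # single-cell write on the underlying view
        self._view[idx] = value

    def __getitem__(self, idx):
        return self._at(idx)

    def __setitem__(self, idx, value):
        self._at_set(idx, value)

h = TinyHist((3, 3))
h[1, 2] = 5.0
print(h[1, 2])  # prints 5.0
```

Keeping the per-cell access in dedicated helpers lets the (more complex) public `__getitem__`/`__setitem__` handle slices and locators while the hot single-cell path stays cheap.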