Closed rob-tay closed 5 years ago
First, a word of warning: following a long chain of discussion on https://gitter.im/HSF/PyHEP-histogramming, the loc, iloc, and __getitem__ methods on counts will be removed. Aghast was overstepping its scope as a format converter, and this job will be done by histogramming libraries like boost-histogram and hist. We've been talking a lot about what a good syntax for that would be.
Staring at your example, you had me convinced for a while that this was a bug, but actually, it's not. It comes from the fact that some ("low-level") array views include under/overflow bins, while others ("high-level") include them only if you ask, placing them where you ask. You used the low-level array view. The selection "cat2" puts all other categories (there's only one, "cat1") into an overflow bin. To use the high-level view, do counts[:] (no overflow) or counts[:numpy.inf] (with overflow).
>>> h.loc["cat2"].counts[:]
array([[ 99, 119, 109, 109,  95, 104, 102, 106, 112, 122]])
>>> h.loc["cat2"].counts[:numpy.inf]
array([[ 99, 119, 109, 109,  95, 104, 102, 106, 112, 122],
       [  9,  25,  29,  35,  54,  67,  60,  84,  80,  94]])
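To make the overflow behavior concrete, here is a rough sketch in plain NumPy. It does not use Aghast at all: the select_category helper and the variable names are hypothetical, and the array simply reuses the counts printed above. It only illustrates the idea that selecting one category folds every other category into a single overflow row, which the "with overflow" view then appends.

```python
import numpy as np

# Hypothetical stand-in for the stored counts (not Aghast's internal layout):
# one row per category, one column per bin. Values are taken from the output above.
categories = ["cat1", "cat2"]
counts = np.array([
    [9, 25, 29, 35, 54, 67, 60, 84, 80, 94],           # cat1
    [99, 119, 109, 109, 95, 104, 102, 106, 112, 122],  # cat2
])


def select_category(counts, categories, name):
    """Return (selected, overflow): the chosen row and the sum of all other rows."""
    index = categories.index(name)
    selected = counts[index:index + 1]                  # keep the 2D shape
    others = np.delete(counts, index, axis=0)           # every unselected category
    overflow = others.sum(axis=0, keepdims=True)        # folded into one row
    return selected, overflow


selected, overflow = select_category(counts, categories, "cat2")

without_overflow = selected                      # analogous to counts[:]
with_overflow = np.vstack([selected, overflow])  # analogous to counts[:numpy.inf]

print(without_overflow)
print(with_overflow)
```

With only two categories, the overflow row is exactly the "cat1" counts, which is why the low-level view looked like the selection had no effect.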
If we're using these selections to perform format conversions, we'll need to transition to some method that would be used "internally" for the format conversions only (not by users).
Unless I'm misunderstanding how the slicing should work, it seems that slicing on a categorical axis when there is also another axis does not have the desired effect:
If I wanted to select only the counts for "cat2", I assumed this would work:
However, that just produces a histogram with the counts for both categories: