scikit-hep / hist

Histogramming for analysis powered by boost-histogram
https://hist.readthedocs.io
BSD 3-Clause "New" or "Revised" License

[BUG] Error when sorting IntCat axis if values >1e8 #497

Open lfarinaa opened 1 year ago

lfarinaa commented 1 year ago

Hist.sort() calls axis._ax.index(), which raises a KeyError when the IntCat values are large: in the example below, values around 1e7 sort fine, but values around 1e8 fail.

Example:

import numpy as np
from hist import Hist  # version 2.6.3

hTest = (
    Hist.new.IntCat(
        np.arange(10_000_000, 10_000_200, 1), name="pdgCode", label="pdgCode"
    )
    .Weight()
    .fill(np.arange(10_000_000, 10_000_200, 1))
)
hTest.sort(0)  # works

hTest = (
    Hist.new.IntCat(
        np.arange(100_000_000, 100_000_200, 10), name="pdgCode", label="pdgCode"
    )
    .Weight()
    .fill(np.arange(100_000_000, 100_000_200, 10))
)
print(hTest)  # works
hTest.plot()  # works
hTest.sort(0)  # error: KeyError: '100000010 not in axis'

To me this looks like some kind of int -> float -> int conversion problem. It may seem like a corner case, but PDG codes for nuclei have values in the ~1e9 range, so it would be great if sorting were supported in this case as well.
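
As a rough check of the int -> float -> int hypothesis, the sketch below round-trips the two value ranges from the example through a 32-bit float using only the standard library. The helper names `roundtrip_float32` and `survives_float32` are made up for this illustration and are not part of hist or boost-histogram. It only shows that a single-precision conversion would be consistent with the observed pattern (the 1e7 range survives, the 1e8 range does not). It does not establish whether or where such a conversion happens in the library, and the error message names the unrounded value 100000010, so the actual cause may be different.

```python
import struct


def roundtrip_float32(value: int) -> int:
    """Convert an int to a 32-bit float and back to an int (hypothetical helper)."""
    (as_float,) = struct.unpack("f", struct.pack("f", float(value)))
    return int(as_float)


def survives_float32(values) -> bool:
    """Return True if every value is unchanged by the float32 round trip."""
    return all(roundtrip_float32(v) == v for v in values)


# Same ranges as the two histograms in the example above.
small = range(10_000_000, 10_000_200, 1)     # sort(0) works
large = range(100_000_000, 100_000_200, 10)  # sort(0) raises KeyError

print(survives_float32(small))         # True: below 2**24, every int is exact in float32
print(survives_float32(large))         # False: float32 spacing near 1e8 is 8
print(roundtrip_float32(100_000_010))  # 100000008
```

If a conversion like this is involved, PDG codes in the ~1e9 range would be affected even more strongly, since float32 spacing there is 64.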