Open antipisa opened 6 years ago
Your second example is identical to the first, can you check it?
It is identical except for the first two lines I added.
import daz
daz.set_ftz()
daz.set_daz()
Setting denormals as zero causes pandas categorical indexing to break.
I'm not especially familiar with denormals, or why treating them as zero is desirable and something we should support, can you fill out a bit more of what exactly is going on? Additionally / alternatively a PR is welcome if you know what is needed.
Setting the denormals-are-zero (DAZ) and flush-to-zero (FTZ) flags converts subnormal numbers to zero. Because pandas categorical indexing relies on the behavior of subnormal floats, t.loc[t.index.categories[0], :] breaks: it cannot locate the first float interval. Categorical labels should not behave this way; you should bit cast your floats to integers if the index is a categorical interval. That would also improve the performance of interval slicing. See https://github.com/numpy/numpy/issues/4581
I'm currently on a Windows machine, which evidently dax won't install on, but am I understanding correctly that with the flags set, the issue is that np.nextafter(0., np.inf) will return 0?
*daz not dax. Yes that is correct. It treats subnormals as zero.
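For readers who can't install daz, here is a minimal sketch of why that value is affected. It uses only the standard library (math.nextafter needs Python 3.9+) and runs with the flags unset. It only shows that the next float after zero is a subnormal, which is the class of values the DAZ/FTZ flags treat as zero. The variable names are illustrative and not taken from pandas.

```python
import math
import sys

# Next representable double after 0.0 in the direction of +inf
# (the stdlib counterpart of np.nextafter(0., np.inf)).
smallest = math.nextafter(0.0, math.inf)

# Smallest positive *normal* double; anything nonzero below this is subnormal.
smallest_normal = sys.float_info.min

# With default FPU flags this holds: the value is nonzero but subnormal.
is_subnormal = 0.0 < smallest < smallest_normal

print(smallest, smallest_normal, is_subnormal)
```

With DAZ/FTZ enabled, as discussed above, a value in this range is treated as zero, so code that expects it to compare strictly greater than 0.0 can no longer rely on that.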
Pandas indexing should not rely on subnormal float behavior inside categorical data. Please bit cast your floats to integers when computing categorical labels: https://github.com/pandas-dev/pandas/blob/648ca95af696266b18ded6bfc5327d0666e3ad23/pandas/core/indexes/interval.py#L56
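A rough sketch of what "bit cast your floats to integers" could look like, using only the standard library. The helper float_bits is hypothetical and is not part of pandas or NumPy. Whether pandas could adopt something like this in interval.py is an assumption, not something established in this thread. The ordering shown only holds for non-negative floats; negative values would need extra handling.

```python
import math
import struct

def float_bits(x: float) -> int:
    """Reinterpret the 8 bytes of a double as a signed 64-bit integer.

    Hypothetical helper for illustration; it copies bits and performs
    no floating-point arithmetic.
    """
    return struct.unpack("<q", struct.pack("<d", x))[0]

zero_bits = float_bits(0.0)
tiny_bits = float_bits(math.nextafter(0.0, math.inf))  # smallest subnormal
one_bits = float_bits(1.0)

# Zero and the smallest subnormal stay distinct as integers, and for
# non-negative floats the integer order matches the numeric order.
print(zero_bits, tiny_bits, one_bits)
```

Integer comparisons are not affected by the DAZ/FTZ flags, so labels compared this way would keep 0.0 and its neighbouring subnormal distinct.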
The following is an example of integer slicing with floating point interval endpoints that should return the first slice of the table:
However, this fails:
since the default behavior for floating endpoints forces the interval index to be cast into an integer slice. This is not ideal.