astrofrog / fast-histogram

:zap: Fast 1D and 2D histogram functions in Python :zap:
BSD 2-Clause "Simplified" License
266 stars 28 forks source link

Fix test suite #40

Closed astrofrog closed 4 years ago

astrofrog commented 5 years ago

The tests were failing due to a deprecation of behavior in hypothesis

astrofrog commented 5 years ago

There is a failure that looks like a real bug - basically if I do:

import numpy as np
from fast_histogram import histogram1d

values = np.array([2.220446e-13, 0.000000e+00], dtype=np.float32)

result = histogram1d(values, bins=1, range=(-2.2204460492503135e-06, 0.0))

then in the C code, tx ends up being 0.0 instead of 2.220446e-13 for the first value, resulting in result being [0]. I don't quite understand why this happens since 2.220446e-13 should be convertable to double without any loss.

manodeep commented 5 years ago

@astrofrog Can you print from the C extensions what histogram1d sees as the elements of the array values?

astrofrog commented 5 years ago

@manodeep - if I do:

    while (size--) {

      printf("dataptr[0] = %f\n", dataptr[0]);

      tx = *(double *)dataptr[0];

      printf("tx = %f\n", tx);

      if (tx >= xmin && tx < xmax) {
        ix = (tx - xmin) * normx * fnx;
        count[ix] += 1.;
      }

      dataptr[0] += stride;
    }

I get:

dataptr[0] = 0.000000
tx = 0.000000
dataptr[0] = 0.000000
tx = 0.000000
manodeep commented 5 years ago

I think the issue might be that the values is a float32 array but you are accessing it as a float64 array. What happens if you change the data type of the values array?

manodeep commented 5 years ago

Or is there a conversion to float64 happening within the extension?

astrofrog commented 5 years ago

@manodeep - the Numpy C iterator is supposed to cast to double on-the-fly:

  dtype = PyArray_DescrFromType(NPY_DOUBLE);
  iter = NpyIter_New(x_array,
                     NPY_ITER_READONLY | NPY_ITER_EXTERNAL_LOOP | NPY_ITER_BUFFERED,
                     NPY_KEEPORDER, NPY_SAFE_CASTING, dtype);
astrofrog commented 5 years ago

@manodeep - oops ignore me, the issue was to do with the use of %f in the printf, if I use %e it all works fine. Sorry for the noise!

manodeep commented 5 years ago

@astrofrog Ahh okay! Glad that's sorted :)

Just in case this helps, if you use %g as the print format, then the appropriate %f or %e will be used depending on the value being printed out.

astrofrog commented 4 years ago

Closing in favor of https://github.com/astrofrog/fast-histogram/pull/46