storpipfugl / pykdtree

Fast kd-tree implementation in Python
GNU Lesser General Public License v3.0

Inconsistent 'invalid' result value #35

Open QuLogic opened 6 years ago

QuLogic commented 6 years ago

I'm not sure if this is a bug or not, but the 'invalid' value is different depending on the query:

>>> import numpy as np
>>> from pykdtree.kdtree import KDTree
>>> a = np.arange(30.0)
>>> kdt = KDTree(a)
>>> kdt.query(np.arange(5) + 0.5)
(array([0.5, 0.5, 0.5, 0.5, 0.5]), array([0, 1, 2, 3, 4], dtype=uint32))
>>> kdt.query(np.arange(50) + 0.5)
(array([ 0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,
        0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,
        0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  1.5,  2.5,  3.5,
        4.5,  5.5,  6.5,  7.5,  8.5,  9.5, 10.5, 11.5, 12.5, 13.5, 14.5,
       15.5, 16.5, 17.5, 18.5, 19.5, 20.5]), array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 15, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 29, 29, 29, 29,
       29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29],
      dtype=uint32))

Everything is okay above.

If we set distance_upper_bound, then invalid points are reported with distance np.inf and index len(data_pts):

>>> kdt.query(np.arange(50) + 0.5, distance_upper_bound=10)
(array([0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
       0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
       0.5, 0.5, 0.5, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5,
       inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf]), array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 15, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 29, 29, 29, 29,
       29, 29, 29, 29, 29, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30],
      dtype=uint32))

But if we pass np.nan, the invalid point is reported with distance sqrt(DBL_MAX) and index UINT_MAX:

>>> kdt.query(np.array([np.nan]))
(array([1.34078079e+154]), array([4294967295], dtype=uint32))

QuLogic commented 6 years ago

For reference, SciPy's cKDTree returns (inf, len(data_pts)) for np.nan.

storpipfugl commented 6 years ago

The intention is to be drop-in compatible with SciPy's cKDTree, so this is a bug.
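
Until the behaviour is made consistent, a caller could normalise both kinds of "invalid" marker onto the (inf, len(data_pts)) convention described above. The following is only a minimal sketch of such a workaround, not part of pykdtree: the helper name `normalize_invalid` and its `n_data_pts` parameter are hypothetical, and it assumes any index at or beyond `n_data_pts` means "no neighbour found".

```python
import numpy as np


def normalize_invalid(dist, idx, n_data_pts):
    """Map any 'no neighbour' marker onto (inf, n_data_pts).

    Hypothetical helper, not part of pykdtree. Assumes that an index
    >= n_data_pts (e.g. len(data_pts) or UINT_MAX) or a non-finite
    distance marks an invalid result.
    """
    dist = np.array(dist, dtype=np.float64)  # copy, so inputs stay untouched
    idx = np.array(idx)                      # copy, keeps the original dtype
    invalid = (idx >= n_data_pts) | ~np.isfinite(dist)
    dist[invalid] = np.inf
    idx[invalid] = n_data_pts
    return dist, idx


# Values taken from the outputs quoted in this issue: one valid hit,
# one distance_upper_bound miss, and one np.nan query result.
dist = np.array([0.5, np.inf, 1.34078079e+154])
idx = np.array([3, 30, 4294967295], dtype=np.uint32)
print(normalize_invalid(dist, idx, 30))
```

With the sample values above, the second and third entries both come back as distance inf and index 30, while the valid first entry is left unchanged.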