rusty1s / pytorch_cluster

PyTorch Extension Library of Optimized Graph Cluster Algorithms
MIT License
813 stars 146 forks source link

Radius CPU computation with KDTree.query() when number of neighbors is higher than max_num_neighbors #126

Open aboulch opened 2 years ago

aboulch commented 2 years ago

Hello,

In the radius search CPU, you are using:

tree = scipy.spatial.cKDTree(x.detach().numpy())
_, col = tree.query( y.detach().numpy(), k=max_num_neighbors, distance_upper_bound=r + 1e-8)

It seems that there is no guaranty that returned indices are randomly selected if there are more points within range r than max_num_neighbors.

Here is a sample code:

data = np.random.rand(100000, 3)
tree = KDTree(data)
distances, ids = tree.query([0.5,0.5,0.5], k=32, distance_upper_bound=0.2)
print(distances.max())
---> 0.044963456313535384 (reproducible if launched several times)

It seems that it searches for the knn and return only the neighbors within r. The right function may be query_ball_point?

aboulch commented 2 years ago

The GPU version does as expected.

rusty1s commented 2 years ago

Are you using an older version of torch-cluster? In the current version, we no longer make use of scipy for this, see https://github.com/rusty1s/pytorch_cluster/blob/master/torch_cluster/radius.py.

aboulch commented 2 years ago

Thanks for qucik response. I was not on the right version of the code documentation. I use the 1.5.9 version of the code see bellow

In an IPython session:

In [5]: import torch_cluster

In [6]: torch_cluster.__version__
Out[6]: '1.5.9'

In [7]: from torch_cluster import radius

In [9]: import torch

In [10]: data = torch.rand(1000000, 3)

In [11]: query = torch.tensor([[0.5,0.5,0.5]])

# Compute distance of the further neighbor

# CPU
In [12]: (data[radius(data, query, r=0.2, max_num_neighbors=32)[1]] - 0.5).norm(dim=1).max()
Out[12]: tensor(0.0607)

In [13]: (data[radius(data, query, r=0.2, max_num_neighbors=64)[1]] - 0.5).norm(dim=1).max()
Out[13]: tensor(0.0701)

In [14]: (data[radius(data, query, r=0.2, max_num_neighbors=128)[1]] - 0.5).norm(dim=1).max()
Out[14]: tensor(0.0824)

# GPU
In [15]: (data[radius(data.cuda(), query.cuda(), r=0.2, max_num_neighbors=32)[1]] - 0.5).norm(dim=1).max()
Out[15]: tensor(0.1985)

In [16]: (data[radius(data.cuda(), query.cuda(), r=0.2, max_num_neighbors=64)[1]] - 0.5).norm(dim=1).max()
Out[16]: tensor(0.2000)

In [17]: (data[radius(data.cuda(), query.cuda(), r=0.2, max_num_neighbors=128)[1]] - 0.5).norm(dim=1).max()
Out[17]: tensor(0.2000)

CPU and GPU do not give the same results.

aboulch commented 2 years ago

Hello, Problem could be that nanoflann does not produce results in a random order when lookning for the neighbors in a given radius. This is due the tree traversal process. A solution could be to shuffle the index list?

radius_cpu.cpp l94

std::vector<std::pair<size_t, scalar_t>> ret_matches;
size_t num_matches = mat_index.index->radiusSearch(
    y_data + i * y.size(1), r * r, ret_matches, params);

// proposition
std::random_shuffle(ret_matches.begin(), ret_matches.end());

for (size_t j = 0; j < std::min(num_matches, (size_t)max_num_neighbors); j++) {
            out_vec.push_back(x_start + ret_matches[j].first);
            out_vec.push_back(i);
}
rusty1s commented 2 years ago

Yes, our GPU implementation is non-deterministic while the CPU one is deterministic (it will pick the closest points). Do you think this is a problem? I am happy to add some form of random shuffling to the CPU implementation (at best controllable via an input argument?). Let me know if you have interest in contributing this feature!