getkeops / keops

KErnel OPerationS, on CPUs and GPUs, with autodiff and without memory overflows
https://www.kernel-operations.io
MIT License

Block-sparsity functions do not support a very large number of clusters #277

Open gabrielfougeron opened 1 year ago

gabrielfougeron commented 1 year ago

Hi,

I adapted the example available here to perform clustering with KeOps.

Here is the code:

import torch
import pykeops

pi = 3.141592653589793238462643383279  # Overkill
N = 512 * 512
ndim = 2

dtype = torch.float32
device = torch.device("cuda:0")

x = torch.rand((N, ndim), dtype=dtype, device=device)
y = torch.rand((N, ndim), dtype=dtype, device=device)

# Kernel bandwidth tied to the number of points, as in kernel density estimation.
sigma = N ** (-1.0 / float(ndim))
grid_size = 2 * sigma
max_dist = 4 * sigma

# Cluster both point clouds on a regular grid of step grid_size,
# then sort the points by cluster so that contiguous ranges make sense.
x_labels = pykeops.torch.cluster.grid_cluster(x, grid_size)
x_ranges, x_centroids, x_cluster_weights = pykeops.torch.cluster.cluster_ranges_centroids(x, x_labels)
x, x_labels = pykeops.torch.cluster.sort_clusters(x, x_labels)

y_labels = pykeops.torch.cluster.grid_cluster(y, grid_size)
y_ranges, y_centroids, y_cluster_weights = pykeops.torch.cluster.cluster_ranges_centroids(y, y_labels)
y, y_labels = pykeops.torch.cluster.sort_clusters(y, y_labels)

# Build the LazyTensors from the cluster-sorted points:
# y is the "i" variable, x the "j" variable.
x_LT = pykeops.torch.LazyTensor(x.view(1, N, ndim))
y_LT = pykeops.torch.LazyTensor(y.view(N, 1, ndim))

distsq_LT = ((x_LT - y_LT) ** 2).sum(-1)
Ker_LT = ((-1 / (2 * (sigma**2))) * distsq_LT).exp() * (1 / (2 * pi * sigma**2))  # Gauss

# Prune cluster pairs whose centroids are further apart than max_dist
# (both sides of the comparison are squared distances).
D = ((x_centroids[None, :, :] - y_centroids[:, None, :]) ** 2).sum(2)
keep = D < max_dist * max_dist

ranges_ij = pykeops.torch.cluster.from_matrix(y_ranges, x_ranges, keep)

Ker_LT.ranges = ranges_ij
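(For reference, the reduction that this setup is building towards would look something like the following hypothetical final step, which the script never reaches since it fails earlier:)

b = torch.randn(N, 1, dtype=dtype, device=device)
a = Ker_LT @ b  # block-sparse Gaussian convolution, shape (N, 1)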

The size of each of the point clouds is N = 512*512, which is probably a bit too big, since I get the following error upon execution:

File "/home/gfo/anaconda3/envs/ot-geomloss/lib/python3.10/site-packages/pykeops/torch/cluster/matrix.py", line 119, 
in from_matrix I.t()[keep.t()]
RuntimeError: nonzero is not supported for tensors with more than INT_MAX elements,   file a support request
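A quick back-of-envelope check, assuming the unit square domain and the parameters of the script above, shows where this limit comes from:

INT_MAX = 2**31 - 1  # 2_147_483_647

N = 512 * 512
sigma = N ** -0.5                        # 1/512
n_cells_per_axis = int(1 / (2 * sigma))  # 256 grid cells along each axis
n_clusters = n_cells_per_axis**2         # up to 65_536 clusters per cloud
keep_entries = n_clusters**2             # ~4.29e9 entries in the boolean mask

print(keep_entries > INT_MAX)  # True: torch.nonzero() on `keep` overflows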

I am assuming that the boolean mask is actually stored in memory (not as a LazyTensor), so I tried to use LazyTensor versions of x_centroids and y_centroids. This failed with the following error: '<' not supported between instances of 'LazyTensor' and 'float'

Do you have a suggestion on how to proceed?

jeanfeydy commented 1 year ago

Hi @gabrielfougeron,

Apologies for taking so long to answer - I was super busy teaching a class on geometric data analysis in Oct.-Dec. 2022. The solution here would be to increase sigma (and thus the grid size): basically, the grid size should be tied to the size of the geometric domain, not to the number of points. As far as I can tell, since your data points are drawn at random in the unit square, a grid size of about 0.1 should be right.
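Concretely, a minimal sketch of this change against the script above (keeping the kernel bandwidth sigma as it is, and only decoupling the clustering scale from it):

grid_size = 0.1  # tied to the unit-square domain, independent of N

x_labels = pykeops.torch.cluster.grid_cluster(x, grid_size)
y_labels = pykeops.torch.cluster.grid_cluster(y, grid_size)
# At most (1 / 0.1)**2 = 100 clusters per cloud, so `keep` has at most
# 100**2 = 10_000 entries, far below INT_MAX.
# Note: max_dist = 4*sigma is now much smaller than a grid cell, so the
# centroid-based pruning becomes very aggressive; the truncation radius
# may need to be rebalanced against the new cell size.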

What do you think?

Best regards, Jean

gabrielfougeron commented 1 year ago

Hi @jeanfeydy

Thank you very much for your answer. Indeed, coarsening the grid (increasing the grid size) does remove the error.

You wrote: "basically, the grid size should be tied to the size of the geometric domain, not the number of points". This runs completely counter to my intuition: I assumed that the grid size should scale with the diameter of the support of the kernel. Of course, I am technically using a Gaussian kernel, whose support is the whole space, but because of its rapid decay its effective support is no bigger than a few standard deviations. And that standard deviation would vary with the number of points in a method like kernel density estimation, as the sketch below illustrates.
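A quick numerical illustration of that scaling, assuming the unit square and the sigma = N**(-1/ndim) rule from the original script:

for N in [128**2, 256**2, 512**2]:
    sigma = N ** -0.5                  # KDE-style bandwidth, ndim = 2
    cells = int(1 / (2 * sigma)) ** 2  # clusters on a sigma-scaled grid
    print(N, cells, cells**2)          # `keep` size grows like N**2
# N = 512**2 already yields 65_536 clusters and ~4.3e9 mask entries.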

Do you agree? If not (which I assume), can you explain what I am missing?

Kind regards,

Gabriel