Open gabrielfougeron opened 1 year ago
Hi @gabrielfougeron,
Apologies for taking so long to answer - I was super busy with teaching a class on geometric data analysis in Oct.-Dec. 2022.
The solution here would be to increase the value of sigma: basically, the grid size should be tied to the size of the geometric domain, not to the number of points. As far as I can tell, since your data points are drawn at random in the unit square, a grid size of 0.1 should be about right.
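As a quick numerical sanity check of this rule of thumb (a plain NumPy sketch with hypothetical variable names, not the original KeOps code): with points drawn uniformly in the unit square, a bandwidth of 0.1 keeps every Gaussian kernel row well populated, whereas a bandwidth that shrinks with the number of points leaves each row nearly empty.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2000
x = rng.random((N, 2))  # points drawn uniformly in the unit square

def gaussian_row_sums(points, sigma):
    # Pairwise squared distances, then row sums of the Gaussian kernel matrix.
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2)).sum(axis=1)

# Bandwidth tied to the domain size (unit square): every point sees many neighbors.
s_domain = gaussian_row_sums(x, sigma=0.1)

# Bandwidth shrunk with N (~ typical point spacing): rows are dominated by the self term.
s_points = gaussian_row_sums(x, sigma=1.0 / np.sqrt(N))

print(s_domain.min(), s_points.min())
```

Since each off-diagonal kernel value only grows with sigma, the larger bandwidth dominates the smaller one entry by entry, which is the point of the rule of thumb.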
What do you think?
Best regards, Jean
Hi @jeanfeydy
Thank you very much for your answer. Indeed, lowering the grid size does remove the error.
You wrote:

> basically, the grid size should be tied to the size of the geometric domain, not the number of points.
This completely defeats my intuition. I assumed that the grid size should scale with the diameter of the support of the kernel. Of course, I am technically using a Gaussian kernel, whose support is the whole space, but because of its rapid decay its effective support is, for all intents and purposes, no bigger than a few standard deviations.
This standard deviation would shrink as the number of points grows in a method like kernel density estimation.
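For reference, this intuition matches standard KDE bandwidth selectors, e.g. Silverman's rule of thumb for 1-D Gaussian kernels, where the bandwidth decays like n^(-1/5) (a sketch, unrelated to the KeOps code in question):

```python
import numpy as np

def silverman_bandwidth(samples):
    # Silverman's rule of thumb for 1-D Gaussian KDE:
    # h = 0.9 * min(std, IQR / 1.34) * n^(-1/5)
    n = len(samples)
    std = np.std(samples, ddof=1)
    q75, q25 = np.percentile(samples, [75, 25])
    spread = min(std, (q75 - q25) / 1.34)
    return 0.9 * spread * n ** (-0.2)

rng = np.random.default_rng(0)
h_small = silverman_bandwidth(rng.standard_normal(100))
h_large = silverman_bandwidth(rng.standard_normal(100_000))
print(h_small, h_large)  # the bandwidth shrinks as the sample grows
```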
Do you agree? If not (which I assume), can you explain what I'm missing?
Kind regards,
Gabriel
Hi,
I adapted the example available here to perform clustering using KeOps.
Here is the code:
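The original snippet is not reproduced above; the clustering it adapts can be sketched in plain NumPy as a standard K-means loop (hypothetical names; the actual code expresses the same nearest-centroid reduction with KeOps LazyTensors so that the N-by-K distance matrix is never materialized):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, D = 10_000, 32, 2
x = rng.random((N, D))                              # point cloud in the unit square
centroids = x[rng.choice(N, K, replace=False)].copy()

for _ in range(10):
    # Assignment step: nearest centroid for every point.
    d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)  # (N, K) distances
    labels = d2.argmin(axis=1)
    # Update step: move each centroid to the mean of its assigned points.
    for k in range(K):
        mask = labels == k
        if mask.any():
            centroids[k] = x[mask].mean(axis=0)

print(labels.shape)
```

With N = 512*512 points, the dense (N, K) distance matrix built here is exactly the kind of buffer KeOps avoids, which is why the NumPy version does not scale to that regime.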
The size of each of the point clouds is `N = 512*512`, which is probably a bit too big, since I get the following error upon execution:
I am assuming that the boolean mask is actually stored in memory (not as a LazyTensor), so I tried to use LazyTensor versions of `x_centroids` and `y_centroids`. This failed with the following error:
```
'<' not supported between instances of 'LazyTensor' and 'float'
```
Do you have a suggestion on how to proceed?