Closed bionicles closed 2 years ago
I agree it would be interesting to provide sliced Wasserstein distance and sliced Gromov-Wasserstein in POT, but I don't see it happening before the NeurIPS deadline.
In the meantime, we have a solver for 1D Wasserstein here: https://pythonot.github.io/gen_modules/ot.lp.html#ot.lp.emd2_1d To do sliced Wasserstein, you only need a projection onto random directions and a loop, which is a few lines of code. If you have the same number of samples, it's just a sort and it's even faster.
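For reference, the "few lines of code" could look roughly like this. It is a minimal NumPy sketch, not POT's implementation: the function name `sliced_wasserstein_equal_n` and its parameters are made up for illustration, and it assumes both point clouds have the same number of samples with uniform weights, so each 1D problem reduces to a sort. With different sample counts or weights, the sort step would have to be replaced by a 1D solver such as ot.lp.emd2_1d applied to each projection.

```python
import numpy as np


def sliced_wasserstein_equal_n(xs, xt, n_projections=50, p=2, seed=None):
    """Monte Carlo sliced Wasserstein-p between two clouds of equal size.

    Hypothetical helper for illustration only: assumes uniform weights and
    xs.shape == xt.shape == (n_samples, dim).
    """
    xs = np.asarray(xs, dtype=float)
    xt = np.asarray(xt, dtype=float)
    if xs.ndim != 2 or xs.shape != xt.shape:
        raise ValueError("xs and xt must both have shape (n_samples, dim)")

    rng = np.random.default_rng(seed)
    dim = xs.shape[1]

    # Random directions on the unit sphere, one per column.
    directions = rng.normal(size=(dim, n_projections))
    directions /= np.linalg.norm(directions, axis=0, keepdims=True)

    # Project both clouds, then sort each 1D projection independently.
    xs_proj = np.sort(xs @ directions, axis=0)
    xt_proj = np.sort(xt @ directions, axis=0)

    # With equal sizes and uniform weights, 1D optimal transport matches
    # sorted samples; average the cost over samples and projections.
    cost = np.mean(np.abs(xs_proj - xt_proj) ** p)
    return cost ** (1.0 / p)
```

Passing a fixed `seed` makes the random directions, and therefore the result, repeatable between calls.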
We also have a PR, not yet merged, with an implementation of 1D Gromov here: https://github.com/PythonOT/POT/pull/129
Note that you are welcome to provide a PR if you implement it before us.
Hi, I noticed that a sliced_wasserstein implementation has been added to the code and also appears already in the documentation, but it is not yet available if installing via pip. Any idea about when that will be available?
Thanks so much
On the next release ;).
Feel free to use the master version of POT, which you can install with:
pip install -U https://github.com/PythonOT/POT/archive/master.zip
Thanks for that. Looking forward to the next release then
Sliced Wasserstein has been added in release 0.8, so I am closing this issue.
Hello, thanks a lot for this!
I was comparing the results of the old sliced Wasserstein distance at this point in time with the results of the new one which you just released. I realized that calling the old and new get_random_projections functions gives different results (after fixing the numpy random number generator and accounting for the transposed matrix). The sliced_wasserstein_distance function, on the other hand, gives the same result once the projections are fixed. I just wanted to understand whether that behavior is due to handling the randomness in a different way (so that reproducibility has been lost between commits) or whether there is any substantial difference. Thanks a lot!
Yes, it is because we had to find a way to make the functions reproducible across backends (we now have a number of backends). The way the matrices are stored can also change the result, but everything else is exactly the same algorithm.
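As a small illustration of the storage point, here is a NumPy-only sketch. It is not POT's actual code, and `draw_directions` with its `layout` argument is hypothetical. With the same seed, drawing the projections directly as a (d, n) matrix or as an (n, d) matrix that is then transposed consumes the same random stream but arranges the values differently, so the direction matrices differ even though the generator state is identical.

```python
import numpy as np


def draw_directions(d, n_projections, seed, layout="dn"):
    """Draw unit-norm random directions as columns of a (d, n_projections) array.

    Hypothetical helper: `layout` only controls the shape in which the
    random numbers are drawn before being arranged as columns.
    """
    rng = np.random.default_rng(seed)
    if layout == "dn":
        proj = rng.normal(size=(d, n_projections))
    else:
        # Same number of draws from the same stream, stored transposed.
        proj = rng.normal(size=(n_projections, d)).T
    return proj / np.linalg.norm(proj, axis=0, keepdims=True)


a = draw_directions(3, 4, seed=0, layout="dn")
b = draw_directions(3, 4, seed=0, layout="nd")

# Same seed and same layout reproduce the directions exactly.
same_layout_matches = np.allclose(a, draw_directions(3, 4, seed=0, layout="dn"))
# Same seed but a different storage layout does not.
layouts_match = np.allclose(a, b)
```

Once a single projection matrix is fixed and passed to both versions, the distance itself no longer depends on how the directions were drawn.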
That is great, thanks a lot for clarifying!
To handle larger problems faster and with less memory usage, would it be possible to add an ot.sliced module?
Papers: sliced Wasserstein and sliced Gromov-Wasserstein; the Cramer-Wold distance takes an integral approach, which might be nice for performance; anchor energy and anchor Wasserstein also seem promising.
A sliced version of FGW would be useful right away for molecular biology, as we could measure distances between large molecules (think antibodies, vaccines, virus proteins, etc.), which can have upwards of a hundred thousand points with many dimensions of features (mass, charge, etc.). Likewise for point clouds for self-driving cars and other assorted geometric applications. This could enable POT to be useful in production.
This recent paper, Statistical and Topological Properties of Sliced Probability Divergences, is optimistic about the quality of results from such metrics.