PythonOT / POT

POT : Python Optimal Transport
https://PythonOT.github.io/
MIT License
2.43k stars 502 forks source link

`ot.unbalanced.sinkhorn_knopp_unbalanced` change between `POT==0.9.4` and `POT==0.9.5`. #691

Open atong01 opened 2 days ago

atong01 commented 2 days ago

Describe the bug

The output of ot.unbalanced.sinkhorn_knopp_unbalanced has changed quite significantly between versions POT==0.9.4 and POT==0.9.5.

CI Test failing in torchcfm package across all python and OS versions tested: https://github.com/atong01/conditional-flow-matching/actions/runs/11873892894/job/33098878771?pr=145

Behavior has been consistent since at least 0.9.1.

To Reproduce

Steps to reproduce the behavior:

[alextong:~] [base] % pip install POT==0.9.4 -q
[alextong:~] [base] % python
Python 3.9.7 (default, Sep 16 2021, 08:50:36)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import ot
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.],[1., 0.]]
>>> ot.unbalanced.sinkhorn_knopp_unbalanced(a, b, M, 1., 1.)
array([[0.51122814, 0.18807032],
       [0.18807032, 0.51122814]])
>>>
[alextong:~] [base] % pip install POT==0.9.5 -q
[alextong:~] [base] % python
Python 3.9.7 (default, Sep 16 2021, 08:50:36)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import ot
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.],[1., 0.]]
>>> ot.unbalanced.sinkhorn_knopp_unbalanced(a, b, M, 1., 1.)
array([[0.32205361, 0.1184769 ],
       [0.1184769 , 0.32205361]])

Environment (please complete the following information):

Output of the following code snippet:

Python 3.9.7 (default, Sep 16 2021, 08:50:36)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import platform; print(platform.platform())
macOS-10.16-x86_64-i386-64bit
>>> import sys; print("Python", sys.version)
Python 3.9.7 (default, Sep 16 2021, 08:50:36)
[Clang 10.0.0 ]
>>> import numpy; print("NumPy", numpy.__version__)
NumPy 1.26.4
>>> import scipy; print("SciPy", scipy.__version__)
SciPy 1.13.1
>>> import ot; print("POT", ot.__version__)
POT 0.9.5
clbonet commented 2 days ago

Hi, The default behavior is now with the KL regularization instead of the entropy. If you add reg_type="entropy", you should recover the result of the 0.9.4 version.

>>> import ot
>>> a = [0.5, 0.5]
>>> b = [0.5, 0.5]
>>> M = [[0, 1], [1, 0]]
>>> ot.unbalanced.sinkhorn_knopp_unbalanced(a, b, M, 1, 1, reg_type="entropy")
array([[0.51122814, 0.18807032],
       [0.18807032, 0.51122814]])
atong01 commented 2 days ago

Hi Clément,

Thank you! May I ask why this changed and / or why it is preferred?

I don't really have a preference but this seems to change (somewhat significantly?) a number of my projects and it would be nice to know what I should be using.

Thanks, Alex

rflamary commented 1 day ago

The entropy has a weird impact that it will promote creation of mass which for unbalanced is a bit counter intuitive. For instance the mass of the plan can be greater than the mass of each of the marginals... It was not the standard in any of the unalanced papers to the bets of my knowledge and we originally implemented is to have similar regularization than sinkhorn paper but it wa sprobably a bad idea.

When using the KL you shrink toward the rank one plan (of mass 1 if both marginals sum to 1) so there is no creation of mass.

I am sorry if it breaks things, we usually promote stability in POT (and API is stable) and should have added a warning for the release. We will be more careful next time.

atong01 commented 1 day ago

Thank you for the explanation Rémi!

Seems like a better default indeed. Agreed, I would have preferred a warning, but thanks for the upgrade :)