Thinklab-SJTU / pygmtools

A Python Graph Matching Toolkit.
https://pygmtools.readthedocs.io/

Sinkhorn unexpected result #100

Closed maberrospi closed 4 months ago

maberrospi commented 4 months ago

Hi! I am trying to use the Sinkhorn function that you provide with a feature similarity matrix calculated using the inner product of two feature matrices. My goal is to achieve a soft assignment matrix since every feature vector corresponds to a node. The following snippet is a smaller example to demonstrate my results.

import numpy as np
import pygmtools as pygm

# Create two test arrays
t1 = np.array(
    [
        [1, 2, 3, 4, 5],
        [6, 7, 8, 9, 10],
        [11, 12, 13, 14, 15],
        [16, 17, 18, 19, 20],
        [21, 22, 23, 24, 25],
    ]
)
t2 = np.array(
    [
        [6, 7, 8, 9, 10],
        [1, 2, 3, 4, 5],
        [21, 22, 23, 24, 25],
        [16, 17, 18, 19, 20],
        [11, 12, 13, 14, 15],
    ]
)
# Calculate the node similarity matrix using the inner product
similarity_matrix = np.inner(t1, t2)

# Use Sinkhorn to get a doubly-stochastic matrix
S = pygm.sinkhorn(similarity_matrix, max_iter=10)

# Test what the equivalent Hungarian algorithm produces
hung = pygm.hungarian(similarity_matrix)

I plotted the matrices for easier interpretation of the results:
[Sinkhorn result heatmap]
[Hungarian result heatmap]

The Hungarian result is as I expected, since the features are matched, but the Sinkhorn result is unexpected to me. I am aware that Sinkhorn should produce a continuous result, but I was expecting something similar to the Hungarian output. Is this the expected behavior? Am I perhaps thinking about this the wrong way?

Thank you in advance!

rogerwwww commented 4 months ago

Hi, thanks for your interest in using our code! It seems that Sinkhorn does not converge in your case. In my experience, you may have to tune the hyperparameter tau. Since your similarity matrix contains relatively large numerical values, I would suggest trying some larger taus such as 10, 100, etc.
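For example, a minimal sketch of that suggestion (the tau value here is purely illustrative; tau and max_iter are keyword arguments of pygm.sinkhorn):

# Sketch: re-run Sinkhorn with a larger tau (illustrative value, not a recommendation)
S = pygm.sinkhorn(similarity_matrix, tau=10., max_iter=10)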

maberrospi commented 4 months ago

Hi, thank you for your quick response! I will try that out and let you know.

So far, I tried reducing the tau value based on the documentation, which states: 'Given a small tau, Sinkhorn performs more closely to Hungarian, at the cost of slower convergence speed and reduced numerical stability.' However, this did not converge either.

You mention that I should tune tau since I have relatively large numbers. I tried using the cosine similarity instead of the inner product, computed as follows:

# Normalize the inner-product similarity to obtain the cosine similarity
t1_norm = np.linalg.norm(t1, axis=1)
t2_norm = np.linalg.norm(t2, axis=1)
t1_norm = t1_norm[..., np.newaxis]  # column vector of t1 row norms
t2_norm = t2_norm[np.newaxis, ...]  # row vector of t2 row norms
sim_matrix = similarity_matrix / (t1_norm @ t2_norm)

This produced a similarity matrix with values between 0 and 1. Specifically:

[[0.9649505  1.         0.92899793 0.93515346 0.9453431 ]
 [1.         0.9649505  0.99355912 0.99534144 0.99778241]
 [0.99778241 0.9453431  0.99889808 0.99955144 1.        ]
 [0.99534144 0.93515346 0.99985557 1.         0.99955144]
 [0.99355912 0.92899793 1.         0.99985557 0.99889808]]

Theoretically, this should have helped the algorithm converge. However, it did not, and produced the following:

[Sinkhorn heatmap on the cosine similarity matrix]

Is there any rule of thumb on the scale that the values of the similarity matrix should have to help with convergence?

rogerwwww commented 4 months ago

Hi there. Again, you should tune the tau value to obtain the expected output matrix. You can check whether the row and column sums are both 1 to see if the algorithm has converged.
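For instance, a quick sketch of that check (the tau, max_iter, and tolerance values are illustrative):

import numpy as np

S = pygm.sinkhorn(similarity_matrix, tau=10., max_iter=200)
# A doubly-stochastic matrix has all row and column sums approximately equal to 1
print(np.allclose(S.sum(axis=1), 1., atol=1e-3))  # row sums
print(np.allclose(S.sum(axis=0), 1., atol=1e-3))  # column sums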

Also, one more thing: you may need to increase the number of iterations (max_iter, default is 10) because, as you have noticed, a lower tau needs more iterations to converge.

maberrospi commented 4 months ago

Hi, thank you for your useful recommendations! The algorithm converged using tau=10, max_iter≈200, and the initial inner-product similarity. The row and column sums are very close to 1.

However, this is a toy problem that I used to test the algorithm. Is there any rule of thumb on what tau should be used depending on the scale of your data? I ask because I do not have ground truth for my actual application, so I cannot easily determine whether the correspondence is correct, as I did for this toy problem.

Again thank you in advance for your help!!

rogerwwww commented 4 months ago

It is terrific to know it is working! I personally often use a tau in the range of 0.01 to 1, but it really depends on the scale of your embedding vectors and your application.
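One pragmatic way to pick a value when there is no ground truth is to sweep a few candidate taus and keep the smallest one whose output still passes the convergence check above (a sketch; the candidate values and tolerance are illustrative):

import numpy as np

# Sketch: try progressively smaller (sharper) taus and report which ones converge
for tau in [100., 10., 1., 0.1, 0.01]:
    S = pygm.sinkhorn(similarity_matrix, tau=tau, max_iter=500)
    converged = (np.allclose(S.sum(axis=0), 1., atol=1e-3)
                 and np.allclose(S.sum(axis=1), 1., atol=1e-3))
    print(f'tau={tau}: converged={converged}')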

rogerwwww commented 4 months ago

Normalizing your embedding vectors may also help in choosing some consistent taus across different applications, though.
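For example, a minimal sketch of that idea (assuming L2-normalized feature rows, so the inner products become cosine similarities in [-1, 1]; the tau value is illustrative):

# Sketch: L2-normalize each feature row before taking inner products
t1_unit = t1 / np.linalg.norm(t1, axis=1, keepdims=True)
t2_unit = t2 / np.linalg.norm(t2, axis=1, keepdims=True)
S = pygm.sinkhorn(t1_unit @ t2_unit.T, tau=0.05, max_iter=500)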

maberrospi commented 4 months ago

Okay, thank you for all your help! One last question: the way to determine whether the algorithm converged is to check that the rows and columns sum to approximately 1. Is that correct?

rogerwwww commented 4 months ago

Yes, you are right.