It seems that something is wrong in the normalization step of your Sinkhorn algorithm:
features = torch.nn.functional.normalize(features, dim=1, p=2)
head = torch.nn.functional.normalize(head, dim=1, p=2)
As I understand it, the shape of "head" is $d\times N$, so normalizing along dim 1 does not look right: that rescales each row of length $N$ rather than each column of length $d$. I think "head" should be normalized along dim 0 instead. Is my understanding correct?
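To make the question concrete, here is a minimal sketch of the difference between the two choices of `dim`. It assumes that "head" really is stored as a $d\times N$ tensor with one vector per column; the sizes `d` and `N` and the random data are hypothetical and not taken from the repository.

```python
import torch
import torch.nn.functional as F

d, N = 4, 3  # hypothetical sizes: feature dimension d, number of columns N
torch.manual_seed(0)
head = torch.randn(d, N)  # assumed layout: one length-d vector per column

# dim=0 rescales each column (each length-d vector) to unit L2 norm.
head_dim0 = F.normalize(head, dim=0, p=2)
col_norms = head_dim0.norm(dim=0)  # shape (N,), every entry is ~1

# dim=1 rescales each row instead, so the rows (not the columns) become unit-norm.
head_dim1 = F.normalize(head, dim=1, p=2)
row_norms = head_dim1.norm(dim=1)  # shape (d,), every entry is ~1
```

If "head" were instead stored as $N\times d$, as the weight of `torch.nn.Linear(d, N)` is, then dim 1 would be the one that normalizes each length-$d$ vector, so the answer depends on the actual layout in the code.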
Thanks for your code.