fagp / sinkhorn-rebasin


sinkhorn-rebasin incompatible with default torchvision VGG #9

Closed: epistoteles closed this issue 1 year ago

epistoteles commented 1 year ago

The library's implementation assumes that a CNN's classifier consists of a single linear layer.

When using the default torchvision.models.vgg11 (i.e., without replacing model.classifier with a single linear layer, as done in the example), functional equivalence after .identity_init() or .random_init() no longer holds.
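A minimal reproduction sketch along these lines (the import path is an assumption; RebasinNet and the *_init() methods are the names used elsewhere in this thread, and the actual constructor signature may differ):

```python
import torch
from torchvision.models import vgg11
from rebasin import RebasinNet  # hypothetical import path

model = vgg11(weights=None).eval()   # default classifier: three linear layers
rebasin = RebasinNet(model)          # note: model.classifier is NOT replaced
rebasin.identity_init()              # identity permutations should be a no-op
rebasin.eval()

x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    diff = (model(x) - rebasin(x)).abs().max().item()
print(diff)  # observed to be non-negligible, which prompted this issue
```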

fagp commented 1 year ago

Hello @epistoteles, thanks for using our library.

From a design perspective, I can assure you that RebasinNet does not make any assumptions about the number of layers. My hypothesis is therefore that you are experiencing a floating-point precision problem; the more layers and classes you have, the greater the impact. The technical explanation is that the permutation transformation is implemented as the product $P_i W_i P_{i-1}^T$, which accumulates errors in the less significant decimal digits (RebasinNet L59). Implementing the permutation transformation this way is standard in the community.
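The effect can be seen with plain tensors (the dimensions and names below are illustrative, not taken from the library): permuting a layer's hidden units and un-permuting them in the next layer is an exact no-op in real arithmetic, but in floating point each dot product then sums the same terms in a different order.

```python
import torch

torch.manual_seed(0)
d = 4096
W = torch.randn(512, d)       # a layer's weight matrix, float32
h = torch.randn(d)            # incoming activations
perm = torch.randperm(d)      # a hard permutation of the hidden units

# Exact in real arithmetic: (W P^T)(P h) == W h. In floating point,
# each dot product now sums the same terms in a different order.
y_ref = W @ h
y_perm = W[:, perm] @ h[perm]

print((y_ref - y_perm).abs().max())   # nonzero, roughly 1e-5 in float32

# The same comparison in float64 pushes the error far down (roughly 1e-13):
print((W.double()[:, perm] @ h.double()[perm] - W.double() @ h.double()).abs().max())
```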

Beyond that, I don't expect you to encounter any other issues with vgg11. Since you haven't provided any code, I tried to reproduce the problem on my side in the following Colab, where I give an example of a successful re-basin of the original vgg11 from torchvision (1000 classes and three linear layers). I also show in the Colab that functional equivalence holds up to 5 decimal digits in 32-bit precision and up to 14 decimal digits in 64-bit precision.
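A sketch of that precision check, under the same API assumptions as the reproduction sketch above (the Colab's actual code may differ):

```python
import torch
from torchvision.models import vgg11
from rebasin import RebasinNet  # hypothetical import path, as above

for dtype in (torch.float32, torch.float64):
    model = vgg11(weights=None).to(dtype).eval()
    rebasin = RebasinNet(model)
    rebasin.random_init()    # random permutations; still functionally equivalent
    rebasin.eval()

    x = torch.randn(1, 3, 224, 224, dtype=dtype)
    with torch.no_grad():
        err = (model(x) - rebasin(x)).abs().max().item()
    print(dtype, err)  # roughly 1e-5 in float32, 1e-14 in float64 per the Colab
```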