Open MaxH1996 opened 2 years ago
This indicates a bug to me, since the permuted model, pi(B), should behave exactly the same as the original model B. Can you test if B and pi(B) have equal outputs on equivalent batches?
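For anyone wanting to run this check, here is a minimal sketch of the idea on a toy two-layer MLP in plain Python. It is not the repository's code: `mlp`, `permute_hidden`, and the layer sizes are hypothetical names and values chosen for illustration. Reordering the hidden units (rows of the first weight matrix and its bias, plus the matching columns of the second weight matrix) should leave the outputs unchanged up to floating-point rounding.

```python
import random

def linear(W, b, x):
    # W is a list of rows (out x in), b has length out, x has length in.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(v):
    return [max(0.0, a) for a in v]

def mlp(params, x):
    W1, b1, W2, b2 = params
    return linear(W2, b2, relu(linear(W1, b1, x)))

def permute_hidden(params, perm):
    # Reorder hidden units: rows of W1 and entries of b1 (the layer's outputs),
    # and the columns of W2 (the next layer's inputs) in the same order.
    W1, b1, W2, b2 = params
    W1p = [W1[i] for i in perm]
    b1p = [b1[i] for i in perm]
    W2p = [[row[i] for i in perm] for row in W2]
    return W1p, b1p, W2p, b2

rng = random.Random(0)
n_in, n_hidden, n_out = 4, 6, 3  # arbitrary toy sizes

def rand_matrix(rows, cols):
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

params = (
    rand_matrix(n_hidden, n_in),
    [rng.gauss(0, 1) for _ in range(n_hidden)],
    rand_matrix(n_out, n_hidden),
    [rng.gauss(0, 1) for _ in range(n_out)],
)

perm = list(range(n_hidden))
rng.shuffle(perm)
permuted = permute_hidden(params, perm)

# Compare B and pi(B) on the same batch of inputs.
batch = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(5)]
max_diff = max(
    abs(a - b)
    for x in batch
    for a, b in zip(mlp(params, x), mlp(permuted, x))
)
print("max abs difference:", max_diff)
```

On a real network the same comparison would be run in eval mode on an identical batch, with a small tolerance. A difference well above rounding error suggests that some layer was not permuted consistently.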
I checked the outputs and they were not equal. I went back to check the ResNet20 I was using, and although it seemed like a pretty standard implementation, it had some slight deviations from the ResNet used in your code. These deviations were apparently the root cause. I changed to the ResNet20 used in the PyTorch implementation of Git Re-basin and now it behaves as expected.
Not sure though what was wrong with the ResNet I used. I'll look into that.
Maybe you were using batchnorm?
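To illustrate how batchnorm could cause this kind of mismatch (a guess at a mechanism, not a diagnosis of the ResNet in this thread): the per-channel scale, shift, running mean, and running variance all have to be permuted together with the preceding layer's output channels. In PyTorch the running statistics are buffers rather than parameters, so code that only walks over the parameters would miss them. The sketch below uses a toy linear, batchnorm (eval mode), ReLU, linear stack in plain Python. All names, including `permute` and `include_running_stats`, are hypothetical.

```python
import math
import random

def linear(W, b, x):
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def batchnorm_eval(v, gamma, beta, mean, var, eps=1e-5):
    # Eval-mode batchnorm: normalize each unit with its stored running statistics.
    return [g * (a - m) / math.sqrt(s + eps) + b
            for a, g, b, m, s in zip(v, gamma, beta, mean, var)]

def forward(p, x):
    h = linear(p["W1"], p["b1"], x)
    h = batchnorm_eval(h, p["gamma"], p["beta"], p["mean"], p["var"])
    h = [max(0.0, a) for a in h]
    return linear(p["W2"], p["b2"], h)

def permute(p, perm, include_running_stats):
    q = dict(p)
    q["W1"] = [p["W1"][i] for i in perm]
    q["b1"] = [p["b1"][i] for i in perm]
    q["gamma"] = [p["gamma"][i] for i in perm]
    q["beta"] = [p["beta"][i] for i in perm]
    q["W2"] = [[row[i] for i in perm] for row in p["W2"]]
    if include_running_stats:
        # The step that is easy to forget: running stats follow the same order.
        q["mean"] = [p["mean"][i] for i in perm]
        q["var"] = [p["var"][i] for i in perm]
    return q

rng = random.Random(0)
n_in, n_hidden, n_out = 4, 6, 3  # arbitrary toy sizes
p = {
    "W1": [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)],
    "b1": [rng.gauss(0, 1) for _ in range(n_hidden)],
    "gamma": [rng.uniform(0.5, 2.0) for _ in range(n_hidden)],
    "beta": [rng.gauss(0, 1) for _ in range(n_hidden)],
    "mean": [rng.gauss(0, 1) for _ in range(n_hidden)],
    "var": [rng.uniform(0.5, 2.0) for _ in range(n_hidden)],
    "W2": [[rng.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_out)],
    "b2": [rng.gauss(0, 1) for _ in range(n_out)],
}

perm = [1, 2, 3, 4, 5, 0]  # fixed cyclic shift, so the permutation is not the identity
batch = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(5)]

def max_diff(q):
    return max(abs(a - b) for x in batch for a, b in zip(forward(p, x), forward(q, x)))

diff_full = max_diff(permute(p, perm, include_running_stats=True))
diff_partial = max_diff(permute(p, perm, include_running_stats=False))
print("all batchnorm state permuted:", diff_full)
print("running stats left in place: ", diff_partial)
```

With every piece of batchnorm state permuted, the outputs agree up to rounding. Leaving the running statistics in their original order changes the function the network computes, which would show up as an accuracy drop like the one described.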
Hi, I am currently trying to run some of the code for my own experiments and I am encountering a somewhat strange problem. When I use weight matching on two trained ResNet20 models with a 4x width multiplier (CIFAR-10), the test accuracy of the permuted model drops significantly, as shown in the plot below. I was wondering if you have encountered this behavior before. I assume I went wrong somewhere, but I am not sure what exactly is causing the problem.
Thanks!