samuela / git-re-basin

Code release for "Git Re-Basin: Merging Models modulo Permutation Symmetries"
https://arxiv.org/abs/2209.04836
MIT License
470 stars, 40 forks

Test Accuracy Drop with Permuted Model #3

Open MaxH1996 opened 2 years ago

MaxH1996 commented 2 years ago

Hi, I am trying to run some of the code on my own experiments and I am running into a strange problem. When I use weight matching on two trained ResNet20 models with a 4x width multiplier (CIFAR-10), the test accuracy of the permuted model drops significantly, as shown in the plot below.

[Image: output]

I was wondering if you have encountered this behavior before. I guess I went wrong somewhere, but I am not sure what exactly is causing this problem.

Thanks!

samuela commented 2 years ago

This indicates a bug to me, since the permuted model, pi(B), should behave exactly the same as the original model B. Can you test whether B and pi(B) produce equal outputs on the same input batches?
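A minimal sketch of that check, assuming a toy two-layer MLP in NumPy rather than the ResNet20 from this issue. The names `forward`, `params_b`, and `params_pi_b` are hypothetical and not taken from the repository. The idea is only to show what "equal outputs" means: permuting the hidden units of one layer, and applying the same permutation to the input side of the next layer, should leave the function unchanged up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for model B: hidden = relu(x @ W1.T + b1), out = hidden @ W2.T + b2
W1, b1 = rng.normal(size=(16, 8)), rng.normal(size=16)
W2, b2 = rng.normal(size=(4, 16)), rng.normal(size=4)


def forward(params, x):
    W1, b1, W2, b2 = params
    hidden = np.maximum(x @ W1.T + b1, 0.0)
    return hidden @ W2.T + b2


# pi(B): permute the hidden units. Rows of W1 and entries of b1 are reordered,
# and the columns of W2 are reordered the same way to undo the permutation.
perm = rng.permutation(16)
params_b = (W1, b1, W2, b2)
params_pi_b = (W1[perm], b1[perm], W2[:, perm], b2)

# Feed the exact same batch to both models and compare the outputs.
x = rng.normal(size=(32, 8))
out_b = forward(params_b, x)
out_pi_b = forward(params_pi_b, x)

max_abs_diff = float(np.max(np.abs(out_b - out_pi_b)))
outputs_match = bool(np.allclose(out_b, out_pi_b, atol=1e-6))
print("max abs difference:", max_abs_diff, "match:", outputs_match)
```

For a real network the same comparison would be run on the actual model with a batch from the test set. A difference well above floating-point noise would suggest that some parameter was not permuted consistently with its neighbours.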

MaxH1996 commented 2 years ago

I checked the outputs and they were not equal. I went back to check the ResNet20 I was using, and although it seemed like a pretty standard implementation, it deviated slightly from the ResNet used in your code. These deviations were apparently the root cause. I switched to the ResNet20 used in the PyTorch implementation of Git Re-Basin and now it behaves as expected.

Not sure though what was wrong with the ResNet I used. I'll look into that.

themrzmaster commented 2 years ago

maybe you were using batchnorm?
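One way a normalization layer with per-channel parameters could produce this symptom, offered as an illustration rather than a diagnosis of the model above: if the weights are permuted but the per-channel scale, shift, and running statistics are left in their original order, pi(B) no longer computes the same function as B. The sketch below assumes a toy NumPy network with an inference-mode batch norm. All names (`forward`, `weights_only`, `full`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
H = 16  # number of hidden channels


def forward(p, x, eps=1e-5):
    # linear -> batch norm (inference mode, fixed running stats) -> relu -> linear
    z = x @ p["W1"].T
    z = p["gamma"] * (z - p["mean"]) / np.sqrt(p["var"] + eps) + p["beta"]
    return np.maximum(z, 0.0) @ p["W2"].T


params = {
    "W1": rng.normal(size=(H, 8)),
    "gamma": rng.uniform(0.5, 1.5, size=H),
    "beta": rng.normal(size=H),
    "mean": rng.normal(size=H),
    "var": rng.uniform(0.5, 1.5, size=H),
    "W2": rng.normal(size=(4, H)),
}
perm = rng.permutation(H)

# Incomplete permutation: only the linear weights are reordered.
weights_only = dict(params, W1=params["W1"][perm], W2=params["W2"][:, perm])

# Consistent permutation: every per-channel vector is reordered as well.
norm_keys = ("gamma", "beta", "mean", "var")
full = dict(weights_only, **{k: params[k][perm] for k in norm_keys})

x = rng.normal(size=(32, 8))
ref = forward(params, x)
weights_only_matches = bool(np.allclose(ref, forward(weights_only, x), atol=1e-6))
full_matches = bool(np.allclose(ref, forward(full, x), atol=1e-6))
print("weights only:", weights_only_matches, "all per-channel params:", full_matches)
```

Whether this applies here depends on how the custom ResNet20 and its permutation spec were defined, which the thread does not show.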