Open liulei277 opened 6 months ago
What's more, we ran the following commands with the default examples/ResNet/main.py:
# mup
python main.py --load_base_shapes resnet18.bsh --lr 0.5 --width_mult 0.5
After running 10 epochs, the accuracy we obtain is 82.14%. This differs from the accuracy (92.78%) reported in your paper, Table 12: ResNet on CIFAR10.
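For anyone trying to repeat this run across several widths, here is a minimal sketch of a helper that builds such command lines. It only uses the flags shown in the command above (`--load_base_shapes`, `--lr`, `--width_mult`). The helper name `build_commands` and the width and learning-rate values in the usage part are assumptions for illustration, not the settings from the paper or from our runs.

```python
import shlex


def build_commands(width_mults, lrs, base_shapes="resnet18.bsh", script="main.py"):
    """Return one argument list per (width_mult, lr) pair.

    Hypothetical helper: it only assembles command lines from the flags
    that appear in the command above and does not run any training.
    """
    commands = []
    for width_mult in width_mults:
        for lr in lrs:
            commands.append([
                "python", script,
                "--load_base_shapes", base_shapes,
                "--lr", str(lr),
                "--width_mult", str(width_mult),
            ])
    return commands


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the paper).
    for cmd in build_commands(width_mults=[0.5, 1, 2], lrs=[0.5]):
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell from `examples/ResNet`, or the argument lists can be passed to `subprocess.run` to launch the runs one after another.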
Hello! We tried to reproduce the experiment in your paper (Figure 16, ResNet on CIFAR-10 for different widths, compared to a base network). We made some modifications to examples/ResNet/main.py:
And we ran the following commands:
Then we got the following picture:

![image](https://github.com/microsoft/mup/assets/118550746/d62cfa28-50b1-4cc9-b2cc-e3a6be91c01b)
Is there anything wrong in our implementation? Thanks.