```python
import mup
import torch

### Replace your custom init, if any
for param in model.parameters():
    ### If initializing manually with fixed std or bounds,
    ### then replace with the same function from mup.init
    # torch.nn.init.uniform_(param, -0.1, 0.1)
    mup.init.uniform_(param, -0.1, 0.1)
```
If you don't specify a base shape, it defaults to the shape of the target model itself, which is equivalent to standard parametrization (SP) even if you are using a MuOptimizer such as MuSGD.
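As a hedged sketch of that point (the model and widths below are placeholders, not from this thread; `set_base_shapes`, `mup.init`, and `MuSGD` are the real `mup` APIs), passing no base model keeps the target model's own shapes as the base shapes, so a MuOptimizer reduces to its SP counterpart:

```python
import torch.nn as nn
import mup
from mup import MuSGD, set_base_shapes

# Hypothetical target model; any nn.Module works.
model = nn.Linear(128, 10)

# With base=None, the base shapes default to model's own shapes,
# so training is equivalent to SP even though MuSGD is used below.
set_base_shapes(model, None)

# To actually get muP, pass a narrower base model instead, e.g.:
# base = nn.Linear(64, 10)
# set_base_shapes(model, base)

# muP-aware re-initialization, done after set_base_shapes.
for param in model.parameters():
    mup.init.uniform_(param, -0.1, 0.1)

optimizer = MuSGD(model.parameters(), lr=0.1)
```

The key design point is that `set_base_shapes` must be called before the `mup.init` re-initialization, since the init functions read the infinite-width shape information it attaches to each parameter.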
We didn't have the mup library when we first wrote the code for the MLP experiment -- you're right that we could use mup.init there. Lines 139 to 141 do manually what mup.init does.
1) https://github.com/microsoft/mup/blob/main/examples/MLP/main.py#L61 If you don't specify a base shape file, then you are using standard parametrization. In that case, will the optimizer still use MuSGD? https://github.com/microsoft/mup/blob/main/examples/MLP/main.py#L257
2) Why does the init function not use mup.init? https://github.com/microsoft/mup/blob/main/examples/MLP/main.py#L139