Open · dilyabareeva opened this issue 5 months ago
I think it is safe to look at all parameters and randomize everything, which is what we currently do. If it is a trainable parameter, it will show up in model.parameters(), assuming correct PyTorch behaviour. So I don't think this is an issue. Right?
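For reference, a minimal sketch of what "randomize everything that shows up in model.parameters()" could look like; the helper name and the in-place Gaussian re-initialization are illustrative assumptions, not the current Quantus implementation:

```python
import torch

def randomize_all_parameters(model: torch.nn.Module, seed: int = 42) -> None:
    """Re-initialize every trainable parameter with random values, in place.

    Any parameter registered via nn.Parameter shows up in model.parameters(),
    regardless of the module type, so this covers Conv and Linear layers as
    well as attention blocks, layer norms, recurrent cells, etc.
    """
    generator = torch.Generator(device="cpu").manual_seed(seed)
    with torch.no_grad():
        for param in model.parameters():
            # Draw fresh values with the same shape as the original parameter.
            param.copy_(torch.randn(param.shape, generator=generator))
```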
Sounds safe to me! I would just add a small test to check whether this method also works for other common nn.Modules.
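A small test along those lines might look as follows; the module list is just a selection of common cases, and randomize_all_parameters refers to the sketch above rather than an existing Quantus helper:

```python
import copy

import pytest
import torch
import torch.nn as nn

# randomize_all_parameters is the sketch from the previous comment.

@pytest.mark.parametrize(
    "module",
    [
        nn.Linear(8, 4),
        nn.Conv2d(3, 8, kernel_size=3),
        nn.LSTM(8, 4),
        nn.MultiheadAttention(embed_dim=8, num_heads=2),
        nn.TransformerEncoderLayer(d_model=8, nhead=2),
    ],
)
def test_randomization_changes_parameters(module):
    original = copy.deepcopy(module)
    randomize_all_parameters(module)
    # Every parameter should differ from its pre-randomization value.
    for (name, before), (_, after) in zip(
        original.named_parameters(), module.named_parameters()
    ):
        assert not torch.allclose(before, after), f"{name} was not randomized"
```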
Problem: Currently we only test Conv and Linear layers in the Randomization metrics. For those, torch provides a built-in reset_parameters(), which we could use instead. What about other torch.nn modules? Does our method work, for example, for transformer architectures?
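To illustrate the open question, a hedged probe of which submodules actually expose a public reset_parameters(); the helper is hypothetical, and the behaviour noted for nn.MultiheadAttention reflects recent PyTorch versions:

```python
import torch.nn as nn

def reset_parameters_recursively(model: nn.Module) -> list:
    """Call reset_parameters() on every submodule that defines it.

    Returns the names of submodules that own parameters directly but expose
    no public reset_parameters(), i.e. modules the built-in route would miss.
    """
    missed = []
    for name, module in model.named_modules():
        if hasattr(module, "reset_parameters"):
            module.reset_parameters()
        elif next(module.parameters(recurse=False), None) is not None:
            # Owns parameters but has no public reset hook
            # (e.g. nn.MultiheadAttention only defines _reset_parameters()).
            missed.append(name)
    return missed

# A transformer layer has no reset_parameters() itself; its Linear and
# LayerNorm submodules do, but the attention block is reported as missed.
layer = nn.TransformerEncoderLayer(d_model=16, nhead=4)
print(reset_parameters_recursively(layer))  # expected: ['self_attn']
```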
Solution: