Some multi-node bug fixes.

vturrisi / solo-learn

solo-learn: a library of self-supervised methods for visual representation learning powered by Pytorch Lightning

MIT License


Some multi-node bug fixes. #246

Closed DanielShalam closed 2 years ago

DanielShalam commented 2 years ago

Hi, thank you for providing this repository. I found it really helpful!

I noticed some errors when training on multiple nodes, and I would like to suggest some possible fixes.

First, in 'base.py', the calculation of 'num_training_steps' should be changed to:
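To illustrate the kind of change this refers to, here is a minimal sketch, not the actual patch. It assumes the problem is that the step count divides the dataset by the per-node batch size instead of the global batch size across all nodes. The function name `estimate_num_training_steps` and all of its parameters are hypothetical and are not taken from solo-learn.

```python
import math


def estimate_num_training_steps(
    dataset_size: int,
    batch_size: int,
    max_epochs: int,
    num_devices: int = 1,
    num_nodes: int = 1,
    drop_last: bool = True,
) -> int:
    """Rough estimate of the total number of optimizer steps.

    Hypothetical helper: `batch_size` is the per-device batch size, and
    every device on every node is assumed to process one batch per step.
    """
    # Samples consumed by one optimizer step across the whole cluster.
    global_batch_size = batch_size * num_devices * num_nodes

    if drop_last:
        # The incomplete final batch is discarded.
        steps_per_epoch = dataset_size // global_batch_size
    else:
        # The incomplete final batch still counts as one step.
        steps_per_epoch = math.ceil(dataset_size / global_batch_size)

    return steps_per_epoch * max_epochs
```

With 1024 samples, a per-device batch size of 32, 4 devices per node and 10 epochs, this gives 80 steps on one node and 40 steps on two nodes. A scheduler that ignores the node count would therefore be configured for twice as many steps as are actually run.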

Second, in 'solo/args/utils.py', the base lr calculation should be changed to:
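As an illustration only, the sketch below shows one common way to scale a base learning rate with the global batch size (the linear scaling rule), assuming the intended fix is to include the number of nodes in that batch size. The function name `scale_learning_rate`, its parameters, and the reference batch size of 256 are assumptions made for this example, not the code in 'solo/args/utils.py'.

```python
def scale_learning_rate(
    base_lr: float,
    batch_size: int,
    num_devices: int = 1,
    num_nodes: int = 1,
    reference_batch_size: int = 256,
) -> float:
    """Linearly scale `base_lr` by the global batch size.

    Hypothetical helper: `batch_size` is the per-device batch size and
    `reference_batch_size` is the batch size `base_lr` was tuned for.
    """
    # Count every device on every node, not just the local ones.
    global_batch_size = batch_size * num_devices * num_nodes
    return base_lr * global_batch_size / reference_batch_size
```

For a base learning rate of 0.3 with a per-device batch size of 64 and 4 devices per node, this yields 0.3 on one node and 0.6 on two nodes. Omitting `num_nodes` would leave the two-node run with half the intended learning rate.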

Thanks!

DonkeyShot21 commented 2 years ago

Thanks for reporting this. We didn't have access to multi-node machines in the past, but we are now trying to catch up on multi-node support. Feel free to send a PR if you have the hardware to test it properly.