Luolc / AdaBound

An optimizer that trains as fast as Adam and as good as SGD.
https://www.luolc.com/publications/adabound/
Apache License 2.0

AttributeError: no attribute 'base_lrs' #13

Open · akaniklaus opened this issue 5 years ago

akaniklaus commented 5 years ago

Thank you very much for sharing this impressive work. I am somehow receiving the following error:

    for group, base_lr in zip(self.param_groups, self.base_lrs):
AttributeError: 'AdaBound' object has no attribute 'base_lrs'
Luolc commented 5 years ago

That's very weird... https://github.com/Luolc/AdaBound/blob/6fa826003f41a57501bde3e2baab1488410fe2da/adabound/adabound.py#L43

Once AdaBound is instantiated, the line above is executed, so at the very least the error shouldn't be has no attribute 'base_lrs'.
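
For reference, here is a minimal sketch of the pattern (a toy optimizer, not the actual AdaBound code): __init__ records base_lrs, and step() later zips over it, which is exactly where the reported error would fire if __init__ never ran.

    import torch

    class ToyOptimizer(torch.optim.Optimizer):
        """Toy sketch of the pattern AdaBound uses; names are illustrative."""

        def __init__(self, params, lr=1e-3):
            defaults = dict(lr=lr)
            super(ToyOptimizer, self).__init__(params, defaults)
            # The line in question: record each group's initial learning rate.
            self.base_lrs = list(map(lambda group: group['lr'], self.param_groups))

        def step(self, closure=None):
            # The reported traceback points at a loop like this one.
            for group, base_lr in zip(self.param_groups, self.base_lrs):
                for p in group['params']:
                    if p.grad is None:
                        continue
                    p.data.add_(p.grad.data, alpha=-base_lr)

    model = torch.nn.Linear(4, 1)
    opt = ToyOptimizer(model.parameters(), lr=0.1)
    print(opt.base_lrs)  # [0.1] -- the attribute exists once __init__ has run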

Could you provide more details about your environment? Python version, OS, how you installed AdaBound (via pip or a direct copy), and which version?

akaniklaus commented 5 years ago

I am getting the error with Python 3 and PyTorch 1.0 on Ubuntu 18.04 (with Anaconda3 installed).

I am not getting any error with PyTorch 0.4, so it should be related to the environment.
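
For anyone else hitting this, a quick sanity check of the environment and a minimal instantiation looks like this (assuming the pip-installed adabound package; the final_lr value is just an example):

    import sys
    import torch
    import adabound

    print(sys.version)        # Python version
    print(torch.__version__)  # 0.4.x vs 1.0.x is the suspected difference

    model = torch.nn.Linear(10, 2)
    optimizer = adabound.AdaBound(model.parameters(), lr=1e-3, final_lr=0.1)
    print(hasattr(optimizer, 'base_lrs'))  # True when __init__ ran normally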


Luolc commented 5 years ago

But it's still very strange, because there have been only a few minor changes to torch.optim.Adam (https://github.com/pytorch/pytorch/commits/cf235e0894eb78b741afb98f15da62673ab41dfe/torch/optim/adam.py) since the release of version 0.4.1 on Jul 27, 2018 (I've tested that version and it's fine), and they shouldn't have any effect on our code.

I am thinking that there might be some changes in the base class that affect our code. I will check PyTorch's commit history since the 0.4.1 release.

akaniklaus commented 5 years ago

By the way, would you like me to integrate decoupled weight decay into your work?

Also, this paper is very relevant to yours: https://openreview.net/forum?id=BJll6o09tm

I was actually planning to decay the partial parameter for the same purpose.
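
For context, the difference between the two schemes is small in code; here is a hedged sketch using a plain gradient step (function names are illustrative, not AdaBound's actual update):

    import torch

    def coupled_decay_step(p, lr, weight_decay):
        # Classic L2 regularization: the decay term is folded into the
        # gradient, so an adaptive method rescales it along with the
        # gradient via the second-moment estimate.
        grad = p.grad.data + weight_decay * p.data
        p.data.add_(grad, alpha=-lr)

    def decoupled_decay_step(p, lr, weight_decay):
        # Decoupled weight decay (Loshchilov & Hutter): the decay term
        # bypasses the gradient path and shrinks the weights directly.
        p.data.mul_(1 - lr * weight_decay)
        p.data.add_(p.grad.data, alpha=-lr)

For vanilla SGD the two are algebraically identical; they diverge for adaptive methods like Adam or AdaBound because the gradient (including any folded-in decay) gets rescaled per coordinate.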


Luolc commented 5 years ago

Sure. :D Contributions are welcome as long as you can guarantee compatibility with the current code.

akaniklaus commented 5 years ago

It would be compatible, but let me know whether I should switch it directly to decoupled weight decay, which has been shown to work better, especially for Adam, or make it controllable through a boolean flag.

I can also make a separate optimizer called AdaBoundW for the decoupled weight decay version. Let me know how you would prefer me to proceed on that.


Luolc commented 5 years ago

It seems that the decoupled weight decay version of Adam is not integrated into the Adam class directly. If so, I think we can just follow that approach and make it a separate one.
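
A hedged sketch of what that separation could look like, wrapping the existing class rather than modifying it (hypothetical, not the eventual implementation):

    import adabound

    class AdaBoundW(adabound.AdaBound):
        # Hypothetical sketch: decoupled weight decay as a separate class,
        # mirroring the Adam / AdamW split.

        def step(self, closure=None):
            # Shrink the weights directly, then run the base step with
            # weight_decay temporarily zeroed so the decay is not applied
            # a second time through the gradient path.
            saved = [group['weight_decay'] for group in self.param_groups]
            for group in self.param_groups:
                if group['weight_decay'] != 0:
                    for p in group['params']:
                        if p.grad is not None:
                            p.data.mul_(1 - group['lr'] * group['weight_decay'])
                group['weight_decay'] = 0
            loss = super(AdaBoundW, self).step(closure)
            for group, wd in zip(self.param_groups, saved):
                group['weight_decay'] = wd
            return loss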

akaniklaus commented 5 years ago

OK, will do that.


akaniklaus commented 5 years ago

Hello @Luolc, FYI, they want to add this to pytorch/contrib if you can make a pull request: https://github.com/pytorch/pytorch/issues/17560

Luolc commented 5 years ago

Thanks for the reminder! I'll have a look.

I'm now busy with my undergraduate thesis... I'll be available after my graduation, in July.