Luolc / AdaBound

An optimizer that trains as fast as Adam and as good as SGD.
https://www.luolc.com/publications/adabound/
Apache License 2.0
2.91k stars, 330 forks

Doesn't work properly with higher lr #15

Closed by Ocelot7777 5 years ago

Ocelot7777 commented 5 years ago

I'm new to deep learning. My project works well with SGD, but something goes wrong when I switch to AdaBound.

When I start with lr=1e-3, training crashes with the following error: invalid argument 2: non-empty 3D or 4D (batch mode) tensor expected for input, but got: [1 x 64 x 0 x 27] at /pytorch/aten/src/THCUNN/generic/SpatialAdaptiveMaxPooling.cu:24

But it seems to work fine if I set lr to 1e-4 or lower. This confuses me a lot. Any ideas?

Environment: python=3.6, pytorch=1.0.1 / 0.4
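
The shape in the error, [1 x 64 x 0 x 27], has a zero-sized third dimension, so the pooling layer is being handed an empty tensor. Below is a minimal sketch of a shape check that could be placed before the pooling call to find where the empty tensor first appears. The helper names `empty_dims` and `check_pool_input` are hypothetical and not part of AdaBound or PyTorch; the check works on a plain shape tuple, which with PyTorch could presumably be obtained via `tuple(x.shape)`.

```python
def empty_dims(shape):
    """Return the indices of dimensions whose size is zero."""
    return [i for i, size in enumerate(shape) if size == 0]


def check_pool_input(shape, name="input"):
    """Raise a descriptive error if the shape has a zero-sized dimension.

    Hypothetical helper: call it right before the pooling layer so the
    failure points at the tensor that became empty.
    """
    bad = empty_dims(shape)
    if bad:
        raise ValueError(
            f"{name} has zero-sized dimension(s) {bad}: shape={tuple(shape)}"
        )
    return tuple(shape)


# Shape copied from the error message in this issue.
reported_shape = (1, 64, 0, 27)
print(empty_dims(reported_shape))  # prints [2]
```

This only locates the empty dimension; it does not by itself show why the higher learning rate leads to an empty tensor in this project.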