kelvinxu / arctic-captions


Questions or bugs in the adam optimizer #36

Open ysjakking opened 7 years ago

ysjakking commented 7 years ago

From lines 84, 85 and 97, 98 of optimizer.py, we can see that b1 and b2 here correspond to '1-b1' and '1-b2' respectively in the original Adam paper, i.e., "Adam: A Method for Stochastic Optimization", Kingma et al. (ICLR 2015). However, I am confused by lines 90 and 91. I think the code should instead be:
fix1 = 1. - (1-b1)**(i_t)
fix2 = 1. - (1-b2)**(i_t)
because b1 and b2 should be replaced by '1-b1' and '1-b2' consistently throughout the implementation.

I wonder how the authors use the adam optimizer when conducting experiments on MSCOCO.
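
The point can be checked numerically with a small, self-contained sketch. It is not the repository's code: the function names, the `consistent` flag, and the values b1 = 0.1 and b2 = 0.001 (chosen as 1 minus the paper's defaults of 0.9 and 0.999) are assumptions for illustration, and the proposed fix is read as exponentiation, i.e. `(1-b1)**(i_t)`.

```python
import math


def adam_moments_paper(m, v, g, t, beta1=0.9, beta2=0.999):
    """Moment update and bias correction in the Adam paper's notation."""
    m = beta1 * m + (1.0 - beta1) * g
    v = beta2 * v + (1.0 - beta2) * g * g
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    return m, v, m_hat, v_hat


def adam_moments_swapped(m, v, g, t, b1=0.1, b2=0.001, consistent=True):
    """Same update with the swapped convention b1 = 1 - beta1, b2 = 1 - beta2.

    consistent=True uses the correction proposed in this issue;
    consistent=False uses 1 - b1**t and 1 - b2**t (hypothetical "unswapped" form).
    """
    m = b1 * g + (1.0 - b1) * m
    v = b2 * g * g + (1.0 - b2) * v
    if consistent:
        fix1 = 1.0 - (1.0 - b1) ** t
        fix2 = 1.0 - (1.0 - b2) ** t
    else:
        fix1 = 1.0 - b1 ** t
        fix2 = 1.0 - b2 ** t
    return m, v, m / fix1, v / fix2


if __name__ == "__main__":
    g = 2.0  # arbitrary gradient value for the first step (t = 1)
    print("paper:       ", adam_moments_paper(0.0, 0.0, g, 1)[2:])
    print("consistent:  ", adam_moments_swapped(0.0, 0.0, g, 1)[2:])
    print("inconsistent:", adam_moments_swapped(0.0, 0.0, g, 1, consistent=False)[2:])
```

With the consistent correction, the first-step estimates match the paper's form (the corrected first moment equals the gradient itself); with the unswapped correction they stay close to the uncorrected, near-zero moments.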

elliottd commented 7 years ago

I have implemented a more recent version of the Adam optimizer here

I hope this helps.