fastai / swiftai

Swift for TensorFlow's high-level API, modeled after fastai
Apache License 2.0

Updating LAMB optimizer to v3 #13

Closed: Shashi456 closed this 5 years ago

Shashi456 commented 5 years ago

While LAMB v1 had the following debiasing step: [image], LAMB v3 no longer includes it: [image]

Also, even before this PR, I noticed that the code had

let num = debias1 * state[StateKeys.avgGrad]!

while it should have been:

let num = state[StateKeys.avgGrad]! / debias1

since the Python version would look something like:

step = (exp_avg/debias1) / ((exp_avg_sqr/debias2).sqrt()+eps)
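
To see why dividing is the right direction, here is a minimal scalar sketch in plain Python. It is not the swiftai or fastai code. It assumes the usual Adam-style setup, which the thread does not spell out: moving averages start at zero, are updated with a (1 - beta) dampening factor, and debias = 1 - beta**step. The helper names `debias` and `debiased_step` and the beta values are hypothetical.

```python
def debias(beta, step):
    # Assumed bias-correction factor for a zero-initialized moving average.
    return 1.0 - beta ** step


def debiased_step(exp_avg, exp_avg_sqr, step, beta1=0.9, beta2=0.99, eps=1e-8):
    # Scalar version of the Python formula quoted above.
    debias1 = debias(beta1, step)
    debias2 = debias(beta2, step)
    return (exp_avg / debias1) / ((exp_avg_sqr / debias2) ** 0.5 + eps)


# First optimizer step with gradient g (illustrative values only).
g = 2.0
beta1, beta2 = 0.9, 0.99
exp_avg = (1 - beta1) * g          # moving average of the gradient after one update
exp_avg_sqr = (1 - beta2) * g * g  # moving average of the squared gradient

debias1 = debias(beta1, 1)
divided = exp_avg / debias1     # division undoes the shrinkage from zero initialization
multiplied = debias1 * exp_avg  # multiplication shrinks the average a second time

first_step = debiased_step(exp_avg, exp_avg_sqr, 1)
```

Under these assumptions, `divided` recovers the raw gradient on the first step, while `multiplied` ends up smaller by a factor of (1 - beta1) squared.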

sgugger commented 5 years ago

Thanks!