HomebrewNLP / Olmax

HomebrewNLP in JAX flavour for maintainable TPU-Training
BSD 2-Clause "Simplified" License
45 stars, 5 forks

fix(shampoo): don't debias stat #95

Closed ClashLuke closed 1 year ago

ClashLuke commented 1 year ago

Very important fix: reduces loss by 5%. Previously, the statistic was not debiased. However, fixing the debiasing broke this property, so re-enabling the non-debiased statistic is necessary.
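
For context, here is a minimal sketch of what "debiasing" an exponential-moving-average statistic usually means (Adam-style bias correction). It is not the code from this PR: the function names, the scalar statistic, and `beta=0.9` are assumptions for illustration only, and Shampoo's real statistics are matrices rather than scalars.

```python
def ema_update(stat, value, beta):
    """One EMA step: stat <- beta * stat + (1 - beta) * value."""
    return beta * stat + (1.0 - beta) * value


def debias(stat, beta, step):
    """Adam-style bias correction for an EMA initialised at zero.

    `step` counts from 1, so the divisor is never zero for 0 <= beta < 1.
    """
    return stat / (1.0 - beta ** step)


def run(values, beta, use_debias):
    """Accumulate `values` into an EMA and return the statistic seen at each step."""
    stat = 0.0  # zero initialisation is what introduces the early-step bias
    seen = []
    for step, value in enumerate(values, start=1):
        stat = ema_update(stat, value, beta)
        seen.append(debias(stat, beta, step) if use_debias else stat)
    return seen


# Hypothetical constant input, chosen so the bias is easy to see.
raw = run([1.0, 1.0, 1.0], beta=0.9, use_debias=False)
corrected = run([1.0, 1.0, 1.0], beta=0.9, use_debias=True)
```

With a constant input, `corrected` stays at roughly the input value from the first step, while `raw` starts small and only gradually approaches it. In this sketch, the PR's "don't debias" behaviour corresponds to `use_debias=False`, i.e. using the raw statistic.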