HomebrewNLP / Olmax

HomebrewNLP in JAX flavour for maintainable TPU-Training
BSD 2-Clause "Simplified" License

fix(optimizer): debias in correct direction #94

Closed ClashLuke closed 1 year ago

ClashLuke commented 1 year ago

A bunch of fixes, including removal of an information leak and correct debiasing.
Previous runs from #92 cannot be trusted, as their better convergence was likely caused by the information leak. Still keeping the normalization, as it's probably better.
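
For context on what "debiasing in the correct direction" usually means, here is a minimal sketch of the standard Adam-style bias correction for a zero-initialised moving average. It is not taken from this PR's diff: the helper names `ema_update` and `debias` and the constant-gradient setup are assumptions made for illustration, and the "wrong direction" line is only a guess at the kind of bug the title refers to.

```python
def ema_update(state, grad, beta):
    """One exponential-moving-average step (state starts at zero)."""
    return beta * state + (1 - beta) * grad


def debias(state, beta, step):
    """Adam-style bias correction: divide by (1 - beta**step), step counted from 1."""
    return state / (1 - beta ** step)


beta = 0.9
grad = 2.0  # constant gradient, so the debiased estimate should recover it
state = 0.0
for step in range(1, 6):
    state = ema_update(state, grad, beta)
    corrected = debias(state, beta, step)  # correct direction: scales the estimate up
    wrong = state * (1 - beta ** step)     # hypothetical wrong direction: shrinks it further
```

With a constant gradient the raw average underestimates the gradient in early steps, dividing by `1 - beta**step` recovers it, and multiplying instead pushes the estimate even further towards zero.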