lessw2020 / Ranger21

Ranger deep learning optimizer rewrite to use newest components
Apache License 2.0
320 stars 44 forks

local variable 'neg_grad_ma' referenced before assignment when momentum_type is not "pnm" #34

Closed lechmazur closed 2 years ago

lechmazur commented 2 years ago

Bug at line 904: the pn-momentum calculation needs to be guarded with `if self.momentum_pnm:`; otherwise `neg_grad_ma` is referenced before assignment whenever `momentum_type` is not "pnm".

I'm trying to diagnose NaNs that appear after some batches with a large learning rate, and to make Ranger21 behave like plain AdamW (if that's possible in the first place).
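A minimal sketch of the reported failure mode and the suggested guard. The function names, default values, and the simplified momentum arithmetic here are illustrative only, not the actual Ranger21 source; they just reproduce the "local variable referenced before assignment" pattern:

```python
def step_buggy(momentum_pnm: bool, grad_ma: float = 0.9, beta2: float = 0.999) -> float:
    """Illustrative only: neg_grad_ma is assigned on one branch but used unconditionally."""
    if momentum_pnm:
        neg_grad_ma = 0.1  # only assigned on the pnm path
    # Bug: when momentum_pnm is False, this raises
    # UnboundLocalError: local variable 'neg_grad_ma' referenced before assignment
    return grad_ma * (1 + beta2) - neg_grad_ma * beta2

def step_fixed(momentum_pnm: bool, grad_ma: float = 0.9, beta2: float = 0.999) -> float:
    """Illustrative fix: the pn-momentum combination runs only under the guard."""
    momentum = grad_ma
    if momentum_pnm:
        neg_grad_ma = 0.1
        # guard the pnm-specific combination, as the issue suggests
        momentum = grad_ma * (1 + beta2) - neg_grad_ma * beta2
    return momentum
```

With the guard in place, `step_fixed(False)` falls back to the plain moving average instead of raising.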