Open And233 opened 1 week ago
Since these two wrapper implementations take momentum rather than betas and weight_decay, could I get a version that can be combined with an AdamW-like optimizer?