update optim.py

dhlee347 / pytorchic-bert

Pytorch Implementation of Google BERT

Apache License 2.0

591 stars 179 forks

Open zihangJiang opened 5 years ago

zihangJiang commented 5 years ago

Fix the weight decay grouping over `param_optimizer` so that it agrees with the original (Hugging Face) implementation.

The current implementation seems to apply a weight decay of 0.01 to all parameters, because the condition `n not in no_decay` is always True.
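A minimal sketch of the likely cause, assuming `no_decay` is a list of name fragments such as `'bias'`, `'gamma'`, and `'beta'`, and that `n` is a full dotted parameter name of the kind `named_parameters()` yields. The parameter names and the two helper functions below are hypothetical and only illustrate the difference between an exact list-membership test and a substring test; they are not the code from `optim.py`.

```python
# Hypothetical dotted parameter names, in the style of model.named_parameters().
names = [
    "embed.tok_embed.weight",
    "blocks.0.proj.weight",
    "blocks.0.proj.bias",
    "blocks.0.norm1.gamma",
    "blocks.0.norm1.beta",
]

# Assumed list of name fragments that should be excluded from weight decay.
no_decay = ["bias", "gamma", "beta"]


def split_exact(names, no_decay):
    """Group by exact list membership: `n in no_decay`.

    A full dotted name never equals a bare fragment, so every
    parameter ends up in the decay group.
    """
    decay = [n for n in names if n not in no_decay]
    skip = [n for n in names if n in no_decay]
    return decay, skip


def split_substring(names, no_decay):
    """Group by substring match: does any fragment occur inside the name?"""
    decay = [n for n in names if not any(nd in n for nd in no_decay)]
    skip = [n for n in names if any(nd in n for nd in no_decay)]
    return decay, skip


exact_decay, exact_skip = split_exact(names, no_decay)
sub_decay, sub_skip = split_substring(names, no_decay)

print("exact match, no-decay group:", exact_skip)
print("substring match, no-decay group:", sub_skip)
```

With the exact-membership test the no-decay group stays empty, so any decay rate attached to the first group would reach biases and layer-norm parameters as well. The substring test is one way to get the intended split; whether it matches the proposed change line for line is not confirmed by this thread.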