dhlee347 / pytorchic-bert

Pytorch Implementation of Google BERT
Apache License 2.0
591 stars, 179 forks

update optim.py #22

Open zihangJiang opened 5 years ago

zihangJiang commented 5 years ago

Fix weight decay in param_optimizer to agree with the original (Hugging Face's) implementation.

(The current implementation seems to apply a weight decay of 0.01 to all parameters, since "n not in no_decay" is always True.)
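
To illustrate the issue, here is a minimal sketch in plain Python (no PyTorch needed). It assumes that `n` is a full dotted parameter name, as returned by `model.named_parameters()`, and that `no_decay` is a list of short name fragments. The parameter names and the contents of `no_decay` below are hypothetical, chosen only for illustration. Under those assumptions, an exact membership test never matches, while a substring test separates the two groups as intended.

```python
# Hypothetical parameter names standing in for the names yielded by
# model.named_parameters(); they are illustrative, not taken from the repo.
param_names = [
    "embed.tok_embed.weight",
    "blocks.0.attn.proj_q.bias",
    "blocks.0.norm1.gamma",
    "blocks.0.norm1.beta",
]

# Assumed list of name fragments that should be excluded from weight decay.
no_decay = ["bias", "gamma", "beta"]

# Exact membership test: a full dotted name is never equal to a short
# fragment, so every parameter lands in the "decay" group.
decay_exact = [n for n in param_names if n not in no_decay]

# Substring test: a parameter is excluded from decay if any fragment
# occurs anywhere inside its name.
decay_substr = [n for n in param_names
                if not any(nd in n for nd in no_decay)]
no_decay_substr = [n for n in param_names
                   if any(nd in n for nd in no_decay)]

print(len(decay_exact), len(decay_substr), len(no_decay_substr))
```

With the substring test, only the weight matrix keeps the 0.01 decay, while the bias and the two normalization parameters fall into the zero-decay group.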