Closed: gotobelieve closed this issue 4 years ago
Is there a specific reason for making these parameters non-trainable? I believe their computational overhead should be negligible. Typically all model parameters are trainable by default, and I'm not sure whether fixing these parameters would lead to a performance drop.
Thanks, I will try it later.
Hi, according to the code in transformer_layer.py, these two variables are trainable parameters. I'm not sure what the reason behind this setting is.