Closed PromptExpert closed 6 years ago
How to enable averaging attention networks?
Oh sorry, I didn't see your issue here. If you are still interested, the option is
--transformer-decoder-autoreg average-attention
How to enable averaging attention networks?