machelreid / subformer

Code for the Subformer, from the EMNLP 2021 Findings paper "Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers" by Machel Reid, Edison Marrese-Taylor, and Yutaka Matsuo.
https://arxiv.org/abs/2101.00234
MIT License