AkiraTOSEI / ML_papers

ML_paper_summary (in Japanese)

DeLighT: Very Deep and Light-weight Transformer #38

Open AkiraTOSEI opened 4 years ago


TL;DR

A lightweight and efficient transformer, DeLighT (Deep and Light-weight Transformer), is proposed. The key ideas are block-wise scaling, which allocates more parameters to deeper blocks, and DExTra, an improved version of DeFINE that first groups channels via group linear transformations and then expands the number of dimensions. Higher accuracy is achieved with lower computational cost.
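As a rough illustration of block-wise scaling, the depth of each block can grow linearly from the input side to the output side. The sketch below assumes a linear interpolation between a minimum depth `n_min` and a maximum depth `n_max` (the parameter names and exact rounding are my assumptions, not taken from the paper):

```python
# Sketch of DeLighT-style block-wise scaling (assumed formula: per-block
# depth interpolated linearly from n_min at the first block to n_max at
# the last, so deeper blocks receive more parameters).
def blockwise_depths(num_blocks, n_min=4, n_max=8):
    """Return a per-block depth list: shallow near the input, deep near
    the output."""
    if num_blocks == 1:
        return [n_max]
    return [round(n_min + (n_max - n_min) * b / (num_blocks - 1))
            for b in range(num_blocks)]

print(blockwise_depths(5))  # -> [4, 5, 6, 7, 8]
```

With five blocks the depths ramp from 4 up to 8, concentrating capacity where the representations are most abstract.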

Why it matters:

Paper URL

https://arxiv.org/abs/2008.00623

Submission Date (yyyy/mm/dd)

2020/08/03

Authors and institutions

Sachin Mehta, Marjan Ghazvininejad, Srinivasan Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi

Methods

Results

Comments