AkiraTOSEI / ML_papers

ML_paper_summary (in Japanese)

It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners #116

Open AkiraTOSEI opened 3 years ago

AkiraTOSEI commented 3 years ago

TL;DR

The authors build on PET, which rephrases inputs as cloze-style patterns containing masked tokens and maps the masked language model's predictions to labels via a verbalizer. They extend PET to predict multiple masked tokens, and the resulting model surpasses GPT-3 on SuperGLUE. GPT-3 has hundreds of billions of parameters, while their model uses only about 1/1000th of that. [Figure: SuperGLUE performance]
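The pattern-verbalizer idea behind PET can be sketched as follows. This is a minimal illustration, not the paper's implementation: the pattern, verbalizer tokens, and score dictionary below are all hypothetical, and a real system would obtain the scores from a masked language model.

```python
# Minimal sketch of PET's pattern-verbalizer idea.
# All names and values here are illustrative, not from the paper's code.

def apply_pattern(premise: str, hypothesis: str) -> str:
    """Rephrase an NLI example as a cloze question with one masked token."""
    return f'"{hypothesis}"? [MASK], "{premise}"'

# Verbalizer: map each label to a single token the masked LM could predict.
VERBALIZER = {"entailment": "Yes", "contradiction": "No"}

def classify(mask_token_scores: dict) -> str:
    """Pick the label whose verbalizer token scores highest at [MASK].

    mask_token_scores: hypothetical LM scores for vocabulary tokens
    at the [MASK] position (in practice, masked-LM logits).
    """
    return max(
        VERBALIZER,
        key=lambda label: mask_token_scores.get(VERBALIZER[label], float("-inf")),
    )

pattern = apply_pattern("A man is sleeping.", "A man is awake")
label = classify({"Yes": 0.1, "No": 2.3})
```

The paper's main extension is allowing the pattern to contain several masked tokens, so verbalizers are no longer restricted to single-token answers.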

Why it matters:

Paper URL

https://arxiv.org/abs/2009.07118

Submission Date (yyyy/mm/dd)

2020/09/15

Authors and institutions

Timo Schick, Hinrich Schütze

Methods

Results

Comments