SpanBERT: Improving Pre-training by Representing and Predicting Spans
TL;DR
Whereas BERT masks and predicts individual tokens (e.g., a single word), SpanBERT masks contiguous spans of tokens (e.g., an idiom) and predicts them. It also drops the Next Sentence Prediction objective, feeding the model a single contiguous segment instead of a sentence pair. It outperforms BERT on many tasks.
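A minimal sketch of the span-masking idea, assuming span lengths are drawn from a clipped geometric distribution as described in the paper; the function names and the simple "keep masking until a budget is met" loop here are illustrative, not the authors' implementation:

```python
import random

def sample_span_length(p=0.2, max_len=10):
    # Geometric(p) span length, clipped at max_len (SpanBERT uses p=0.2, max 10).
    length = 1
    while random.random() > p and length < max_len:
        length += 1
    return length

def mask_spans(tokens, mask_ratio=0.15, mask_token="[MASK]"):
    # Mask contiguous spans until roughly mask_ratio of tokens are covered.
    # The last span may overshoot the budget slightly.
    tokens = list(tokens)
    budget = max(1, int(len(tokens) * mask_ratio))
    masked = set()
    while len(masked) < budget:
        length = sample_span_length()
        start = random.randrange(0, max(1, len(tokens) - length + 1))
        for i in range(start, min(start + length, len(tokens))):
            masked.add(i)
    out = [mask_token if i in masked else t for i, t in enumerate(tokens)]
    return out, sorted(masked)

sentence = "the quick brown fox jumps over the lazy dog".split()
masked, idx = mask_spans(sentence)
print(masked)  # adjacent tokens are masked together, unlike BERT's scattered masks
```

Note that the masked positions come out contiguous in runs, which is the point: the model must recover whole phrases rather than single words, and SpanBERT additionally trains a Span Boundary Objective that predicts each masked token from the span's boundary representations.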
Why it matters:
Paper URL
https://arxiv.org/abs/1907.10529
Submission Date (yyyy/mm/dd)
2019/07/24
Authors and institutions
Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S. Weld, Luke Zettlemoyer, Omer Levy (University of Washington, Princeton University, Allen Institute for AI, Facebook AI Research)
Methods
Results
Comments