jwkanggist / SSL-narratives-NLP-1

Reading self-supervised learning in NLP in reverse order

[Week 3] Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks #4

Dien-ES opened this issue 2 years ago

Dien-ES commented 2 years ago

Keywords

RoBERTa, Language model, Domain-adaptive pretraining, Task-adaptive pretraining

TL;DR

Multiphase adaptive pretraining on domain and task corpora offers large gains in task performance.

Abstract

Language models pretrained on text from a wide variety of sources form the foundation of today’s NLP. In light of the success of these broad-coverage models, we investigate whether it is still helpful to tailor a pretrained model to the domain of a target task. We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks, showing that a second phase of pretraining in-domain (domain-adaptive pretraining) leads to performance gains, under both high- and low-resource settings. Moreover, adapting to the task’s unlabeled data (task-adaptive pretraining) improves performance even after domain-adaptive pretraining. Finally, we show that adapting to a task corpus augmented using simple data selection strategies is an effective alternative, especially when resources for domain-adaptive pretraining might be unavailable. Overall, we consistently find that multiphase adaptive pretraining offers large gains in task performance.
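
The core recipe is simply to continue masked-LM pretraining on unlabeled in-domain (DAPT) or task (TAPT) text before fine-tuning. Below is a minimal sketch of that idea using the Hugging Face `transformers` Trainer API, not the authors' released code; the corpus file name and hyperparameters are placeholders for illustration.

```python
# Sketch of DAPT/TAPT: continue RoBERTa's masked-LM pretraining on an
# unlabeled domain or task corpus, then fine-tune the adapted checkpoint
# on the labeled task. Assumes `transformers` and `datasets` are installed;
# "domain_corpus.txt" is a placeholder file with one text example per line.
from transformers import (
    RobertaTokenizerFast,
    RobertaForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")

# Unlabeled in-domain (DAPT) or task-specific (TAPT) text.
corpus = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
corpus = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

# Dynamic 15% token masking, as in RoBERTa pretraining.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="dapt-roberta",
    per_device_train_batch_size=8,
    num_train_epochs=1,   # illustrative only; the paper pretrains far longer
    learning_rate=1e-4,
)

Trainer(
    model=model,
    args=args,
    train_dataset=corpus,
    data_collator=collator,
).train()
# The checkpoint saved under `dapt-roberta/` is then fine-tuned on the
# labeled downstream task with a standard classification head.
```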

Paper link

https://aclanthology.org/2020.acl-main.740/

Presentation link

https://drive.google.com/file/d/1mziQEteSwHxLZ6Jb0vCsX3w1KpUGGmzA/view?usp=sharing

Video link

https://youtu.be/2W_2vHamLYo

jwkanggist commented 2 years ago

Related papers additionally recommended by @yukyunglee, worth reading alongside this one:

Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting

AVocaDo: Strategy for Adapting Vocabulary to Downstream Domain

Task-adaptive Pre-training and Self-training are Complementary for Natural Language Understanding

Additional paper recommended by @jwkanggist:

Just Rank: Rethinking Evaluation with Word and Sentence Similarities