JisuHann / One-day-One-paper

Review paper

XLNet: Generalized Autoregressive Pretraining for Language Understanding #7

Open JisuHann opened 3 years ago


XLNet: Generalized Autoregressive Pretraining for Language Understanding

Unsupervised representation learning in NLP

: two main pretraining approaches, autoregressive (AR) language modeling and autoencoding (AE)

  1. Pretrain neural networks on large-scale unlabeled text corpora
  2. Fine-tune the models or representations on downstream tasks
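The two-phase paradigm above can be sketched with a toy stand-in (unigram counts instead of a neural network; all names here are illustrative, not XLNet's actual pipeline):

```python
from collections import Counter

def pretrain(corpus):
    """Phase 1: learn statistics (here, just unigram counts) from unlabeled text."""
    counts = Counter()
    for sentence in corpus:
        counts.update(sentence.lower().split())
    return counts

def featurize(sentence, counts):
    """Phase 2: reuse the pretrained statistics as a downstream feature."""
    words = sentence.lower().split()
    return sum(counts[w] for w in words) / max(len(words), 1)

# "Pretrain" on raw text, then reuse the result for a downstream task.
unlabeled = ["the cat sat", "the dog ran", "a cat ran"]
stats = pretrain(unlabeled)
score = featurize("the cat", stats)  # average pretrained frequency of the words
```

The point is only the data flow: the expensive first phase sees no labels, and every downstream task starts from its output rather than from scratch.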

XLNet

: a generalized autoregressive pretraining method that leverages the strengths of both AR and AE while avoiding their limitations

  1. maximizes the expected log-likelihood of a sequence over all possible permutations of the factorization order, so each position learns from bidirectional context
  2. as an autoregressive objective, it does not corrupt the input with [MASK] tokens, so it avoids the pretrain-finetune discrepancy of AE methods like BERT
  3. integrates the segment recurrence mechanism and relative positional encoding scheme of Transformer-XL into pretraining
  4. reparameterizes the Transformer(-XL) network (two-stream self-attention) so the prediction is aware of the target position and the permutation objective is well-defined
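Point 1 can be illustrated with a minimal sketch of the permutation objective, maximizing E over factorization orders z of the per-order log-likelihood. This is a toy Monte-Carlo estimate with a dummy conditional model standing in for the Transformer; the function names are illustrative, not from the paper's code:

```python
import math
import random

def sequence_log_likelihood(tokens, order, cond_prob):
    """Log-likelihood of `tokens` factorized in permutation `order`:
    each token is predicted from the tokens revealed earlier in the
    permutation, not from the left-to-right prefix."""
    total = 0.0
    seen = set()
    for pos in order:
        context = frozenset((p, tokens[p]) for p in seen)  # already-revealed tokens
        total += math.log(cond_prob(tokens[pos], pos, context))
        seen.add(pos)
    return total

def expected_permutation_ll(tokens, cond_prob, n_samples=10, rng=None):
    """Monte-Carlo estimate of E_{z ~ Z_T}[ sum_t log p(x_{z_t} | x_{z_<t}) ]."""
    rng = rng or random.Random(0)
    positions = list(range(len(tokens)))
    total = 0.0
    for _ in range(n_samples):
        order = positions[:]
        rng.shuffle(order)  # sample one factorization order z
        total += sequence_log_likelihood(tokens, order, cond_prob)
    return total / n_samples

# A dummy uniform conditional over a 4-token vocabulary stands in for the model.
VOCAB = ["a", "b", "c", "d"]
uniform = lambda token, pos, context: 1.0 / len(VOCAB)

ll = expected_permutation_ll(["a", "b", "c"], uniform, n_samples=5)
# With a uniform model, every order gives 3 * log(1/4).
```

In XLNet only the attention mask changes per sampled order (the sequence itself stays in its natural order), but the objective being averaged is the same as sketched here.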