jwkanggist / SSL-narratives-NLP-1

Reading self-supervised learning in NLP in reverse

[2주차] Latent Retrieval for Weakly Supervised Open Domain Question Answering #3

SeongkukCho opened this issue 2 years ago

SeongkukCho commented 2 years ago

Keywords

Pre-training for Retriever

TL;DR

After pre-training the retriever, the retriever and reader are jointly fine-tuned end-to-end on question-answer pairs from the QA dataset.

Abstract

Recent work on open domain question answering (QA) assumes strong supervision of the supporting evidence and/or assumes a blackbox information retrieval (IR) system to retrieve evidence candidates. We argue that both are suboptimal, since gold evidence is not always available, and QA is fundamentally different from IR. We show for the first time that it is possible to jointly learn the retriever and reader from question-answer string pairs and without any IR system. In this setting, evidence retrieval from all of Wikipedia is treated as a latent variable. Since this is impractical to learn from scratch, we pre-train the retriever with an Inverse Cloze Task. We evaluate on open versions of five QA datasets. On datasets where the questioner already knows the answer, a traditional IR system such as BM25 is sufficient. On datasets where a user is genuinely seeking an answer, we show that learned retrieval is crucial, outperforming BM25 by up to 19 points in exact match.
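Since the paper treats evidence retrieval as a latent variable, the training objective marginalizes the reader's answer likelihood over the retrieved blocks. Below is a minimal sketch (not the ORQA implementation, which is TensorFlow) of that idea: a dual encoder scores blocks by inner product, only the top-k blocks are read, and the loss is the negative marginal log-likelihood. `reader_fn` is a hypothetical stand-in for the BERT reader.

```python
import torch
import torch.nn.functional as F

def orqa_style_loss(q_emb, block_embs, reader_fn, k=5):
    """q_emb: [d] question embedding; block_embs: [N, d] evidence block embeddings.
    reader_fn(indices) -> [k] tensor of log P(answer | question, block) for each
    retrieved block (hypothetical stand-in for the reader model)."""
    scores = q_emb @ block_embs.T                    # S_retr(b, q): inner products, [N]
    topk = torch.topk(scores, k)                     # only the top-k blocks are read
    log_p_retr = F.log_softmax(topk.values, dim=-1)  # log P(b | q), restricted to top-k
    log_p_read = reader_fn(topk.indices)             # log P(answer | q, b), [k]
    # Marginal log-likelihood: log sum_b P(b | q) P(answer | q, b); loss is its negative.
    return -torch.logsumexp(log_p_retr + log_p_read, dim=-1)
```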

Paper link

https://arxiv.org/abs/1906.00300

Presentation link

https://docs.google.com/presentation/d/1ZoOwYp_qWSZz7W8X6nLyQvA1ON7dLnSOAVdn58ZU9h0/edit#slide=id.p5

Video link

https://youtu.be/MypoV0xAn18

SeongkukCho commented 2 years ago

Issue 1. ICT masking and the MLM of BERT: in image contrastive learning, batch size is very important!! (see the sketch after the code link below)

1) Youngsoo's opinion

2) Jaewook's opinion

3) Yukyung's opinion

ICT model code: https://github.com/google-research/language/blob/master/language/orqa/models/ict_model.py
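On Issue 1: ICT pre-training is a contrastive task, so the batch size matters because the other contexts in the batch serve as negatives. A minimal sketch (assuming simple in-batch negatives, not the exact sampled-softmax setup in the linked TensorFlow code): a sentence removed from its passage is the pseudo-query, the surrounding context is its positive, and every other context in the batch is a negative, so a larger batch yields more negatives per example.

```python
import torch
import torch.nn.functional as F

def ict_loss(query_embs, context_embs):
    """query_embs:   [B, d] embeddings of sentences removed from their passages (pseudo-queries).
    context_embs: [B, d] embeddings of the surrounding contexts (pseudo-evidence).
    Each query's positive is its own context; the other B-1 contexts in the batch
    act as negatives, so increasing the batch size increases the number of negatives."""
    logits = query_embs @ context_embs.T      # [B, B] similarity matrix
    labels = torch.arange(logits.size(0))     # diagonal entries are the positives
    return F.cross_entropy(logits, labels)
```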

jwkanggist commented 2 years ago

Thank you for the notes, Seongkuk! @SeongkukCHO