jamesallenevans opened this issue 4 years ago
This week's reading is pretty technical. I tried my best to get an idea of what is going on. They developed GPT-2, a language model trained without task-specific supervision, which nevertheless achieves state-of-the-art results on 7 out of 8 language modeling test datasets in a zero-shot setting.
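For anyone else trying to follow the benchmark claims: the 7-out-of-8 comparison is made in terms of perplexity, the exponentiated average negative log-likelihood per token, where lower is better. A minimal sketch with a toy add-one-smoothed unigram model (the corpus and function name are illustrative, not from the paper):

```python
import math
from collections import Counter

def unigram_perplexity(train_tokens, test_tokens):
    """Perplexity of a Laplace-smoothed unigram model on held-out text."""
    counts = Counter(train_tokens)
    vocab = set(train_tokens) | set(test_tokens)
    total = len(train_tokens)
    log_prob = 0.0
    for tok in test_tokens:
        # Add-one smoothing so unseen tokens get nonzero probability.
        p = (counts[tok] + 1) / (total + len(vocab))
        log_prob += math.log(p)
    # Perplexity = exp of average negative log-likelihood per token.
    return math.exp(-log_prob / len(test_tokens))

train = "the cat sat on the mat".split()
test = "the cat sat".split()
print(unigram_perplexity(train, test))
```

GPT-2's evaluation works the same way in principle, just with a transformer assigning the token probabilities instead of counts.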
What surprises me is the low accuracy of the results. As the authors admit in the paper, performance is "still far from usable". I am wondering whether the low performance is a sacrifice made for multitask generality, or whether NLP algorithms themselves are still underdeveloped overall.
This paper also felt quite technical to me. I was fascinated by GPT-2's ability to answer new questions based on a transcript of conversational questions and answers (CoQA). Could you explain the mechanics (generally) of how the model achieves this, especially when simple heuristic rules are not apparent?
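My rough understanding of the mechanics: the model is only ever doing next-token prediction, so "answering" means conditioning on the document plus the Q/A transcript ending in an answer cue, then decoding the tokens most likely to follow. A conceptual sketch with a toy bigram model and greedy decoding (this stands in for GPT-2's transformer; the corpus is illustrative):

```python
from collections import defaultdict, Counter

def train_bigram(tokens):
    """Bigram language model: counts of next token given current token."""
    model = defaultdict(Counter)
    for a, b in zip(tokens, tokens[1:]):
        model[a][b] += 1
    return model

def generate(model, prompt_tokens, max_new=5):
    """Greedy decoding: repeatedly append the most likely next token."""
    out = list(prompt_tokens)
    for _ in range(max_new):
        nxt = model.get(out[-1])
        if not nxt:
            break  # no continuation seen for this token
        out.append(nxt.most_common(1)[0][0])
    return out

# A toy "transcript" in the Q:/A: format the paper conditions on.
corpus = ("Q: who wrote it ? A: alec radford . "
          "Q: who wrote it ? A: alec radford .").split()
model = train_bigram(corpus)
print(" ".join(generate(model, ["A:"], max_new=3)))
```

The point is that no QA-specific rule exists anywhere: the answer falls out because the training text makes certain continuations of "A:" likely, which is (at a vastly larger scale) how GPT-2 handles CoQA zero-shot.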
It is important to have a model that can generalize to texts from different domains. However, I believe this entails a tradeoff, because a model can be harder to train when there is large variation in the data. I am curious how multitask learning overcomes this issue, or whether it is an issue at all.
Radford, Alec, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. “Language models are unsupervised multitask learners.” OpenAI Blog 1(8).