openai / gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"
https://openai.com/blog/better-language-models/

Why can GPT-2 apply to other tasks without fine-tuning? #154

Closed guotong1988 closed 5 years ago

guotong1988 commented 5 years ago

Thank you!

gionapaolini commented 5 years ago

Imho it's because of the dataset they have used.

It's huge and somewhat clean (they took text from links that had received more than a certain score on Reddit, I believe, though I'm not sure). This means the dataset contained a lot of examples of QA (question–answer), TL;DR: (summarization), and so on.

Since the model is very good at learning patterns, if you format your input in a specific way (for summarization, for example, you concatenate "TL;DR:" to the end of the input), the model will recognize the pattern from the training dataset and try to generate text that follows it.
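As a rough illustration of what "setting your input in a specific way" means, here is a minimal sketch of task-by-formatting. Note that `build_prompt` and the task names are illustrative helpers I made up, not part of this repository; the only thing taken from the discussion above is the idea that cues like "TL;DR:" or Q/A formatting signal the task to the model.

```python
# Hypothetical sketch: in zero-shot use, the task is signaled purely by
# how the input text is formatted, not by any fine-tuning. The model then
# continues the text in the style the cue evokes from its training data.

def build_prompt(text: str, task: str) -> str:
    """Format raw text so the model can recognize a task pattern."""
    if task == "summarization":
        # WebText contains many "TL;DR:" summaries, so appending the cue
        # nudges the model to continue with a summary.
        return text + "\nTL;DR:"
    if task == "qa":
        # Question/answer pairs also appear often in web text.
        return "Q: " + text + "\nA:"
    # No cue: plain open-ended continuation.
    return text

article = "GPT-2 was trained on WebText, a large corpus of outbound Reddit links."
print(build_prompt(article, "summarization"))
```

The formatted string would then be fed to the model as a conditioning prompt (e.g. via the repo's `interactive_conditional_samples.py`), and the generated continuation is read off as the summary or answer.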

MrKrzYch00 commented 5 years ago

My observation is that, due to the varied input it was trained on, it can't be perfect and there is always a risk of unwanted output. I tried fine-tuning. It does help a lot with staying on topic, but only if your input also follows the format of your training samples; otherwise the model will still produce unwanted output. That said, I didn't have much data to train it with.

guotong1988 commented 5 years ago

Thank you!

shayneoneill commented 5 years ago

I'm not gonna lie. The part of my brain where I stored my philosophy minor is currently running around with its hair on fire. This thing is terrifying, and I'm not sure I can articulate why.

It's like someone actually built Searle's Chinese room, and it started asking for a pay rise.