openai / gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"
https://openai.com/blog/better-language-models/

Large model? #182

Closed alsosprachzar closed 5 years ago

alsosprachzar commented 5 years ago

The code in src/interactive_conditional_samples.py never calls the large model; it uses 124M. Is this correct?

purdue512 commented 5 years ago

That is my understanding. I downloaded the 774M model by accident. Then when I went to run it, it failed with an error along the lines of "could not find Model 124M". After downloading the 124M model, it ran.

I would also be interested in running the src/interactive_conditional_samples.py against the 774M model. Is this an easy edit? (sorry if that's a stupid question).

tjsadiq commented 5 years ago

I made three copies of interactive_conditional_samples.py: the first as is, the second with 124M changed to 355M (in two places), and the third changed to 774M. Running python3 src/interactive_conditional_samples.py --top_k 40 --length 800:

- 124M: took 1:05 to generate 693 words
- 355M: took 3:07 to generate 672 words
- 774M: took 6:20 to generate 676 words

So switching to a larger model takes longer to run. Quality is hard to judge.
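For a rough comparison, the timings above can be converted into words per second (hypothetical arithmetic based only on the numbers reported in this comment):

```python
# Rough throughput from the timings reported above.
# (words generated, elapsed seconds) -- 1:05 = 65 s, 3:07 = 187 s, 6:20 = 380 s
timings = {
    "124M": (693, 65),
    "355M": (672, 187),
    "774M": (676, 380),
}

for model, (words, secs) in timings.items():
    print(f"{model}: {words / secs:.1f} words/sec")
```

By this estimate the 774M model generates text roughly six times slower than 124M on the same hardware.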

WuTheFWasThat commented 5 years ago

all you have to do is pass the extra flag --model_name 774M at the command line
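The flag works because the script exposes its keyword arguments (including a model_name that defaults to "124M") as command-line flags, so no copies or edits are needed. Below is a minimal sketch of that mechanism using argparse rather than the repo's actual CLI wiring; interact_model here is a hypothetical stand-in that only reports which model directory would be loaded:

```python
import argparse


def interact_model(model_name="124M", top_k=0, length=None):
    # Stand-in for the real sampling loop: just report which
    # model directory would be loaded for the given flag value.
    return f"models/{model_name}"


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_name", default="124M")
    parser.add_argument("--top_k", type=int, default=0)
    parser.add_argument("--length", type=int, default=None)
    args = parser.parse_args()
    print(interact_model(args.model_name, args.top_k, args.length))
```

With this scheme, omitting --model_name falls back to the 124M default, which matches the behavior described earlier in the thread.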