graykode / gpt-2-Pytorch

Simple Text-Generator with OpenAI gpt-2 Pytorch Implementation
MIT License

How do I modify to run with gpt2-xl (1558M) parameters? #17

Open jmarsil opened 4 years ago

jmarsil commented 4 years ago

Any help would be greatly appreciated!

jasonzhou1 commented 4 years ago

I was able to find the s3 bucket locations of the pretrained GPT2 models here: https://github.com/huggingface/transformers/blob/master/transformers/modeling_gpt2.py (provided by HuggingFace).

To make this work, just download the gpt2-xl checkpoint instead:

curl --output gpt2-pytorch_model.bin https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-xl-pytorch_model.bin
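A quick way to confirm which model size a downloaded checkpoint actually is (and therefore which config values it needs) is to inspect its state dict. This is a minimal sketch, assuming the file name from the curl command above and the key layout of the HuggingFace GPT-2 state dicts (`wte.weight` for token embeddings, `h.<i>.*` for transformer blocks):

```python
import re
import torch

# Load the downloaded checkpoint on CPU; this is just an inspection step,
# no model is built here.
state_dict = torch.load("gpt2-pytorch_model.bin", map_location="cpu")

# Token-embedding matrix has shape (vocab_size, n_embd);
# for gpt2-xl this should be (50257, 1600).
for key, tensor in state_dict.items():
    if key.endswith("wte.weight"):
        print(key, tuple(tensor.shape))

# Count transformer blocks by collecting the indices in keys like "h.0.attn...".
# gpt2-xl has 48 blocks, the small model has 12.
block_ids = set()
for key in state_dict:
    m = re.search(r"h\.(\d+)\.", key)
    if m:
        block_ids.add(int(m.group(1)))
print("n_layer =", len(block_ids))
```

If these shapes don't match what is set in GPT2/config.py, loading the weights will either fail or produce nonsense output, which is what the config change discussed further down in this thread addresses.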

paulbricman commented 4 years ago

@jasonzhou1 I only get gibberish output with the XL model, worse than the small version. Have you actually had any luck with it?

Update: I also tried the other models linked in the script you referenced, without luck.

ZJiaBin commented 4 years ago

> @jasonzhou1 I only get gibberish output with the XL model, worse than the small version. Have you actually had any luck with it?

Before you try the gpt2-xl model, some parameters in gpt-2-Pytorch/GPT2/config.py have to be modified: n_head=25, n_embd=1600, n_layer=48. You can see the details here: https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-xl-config.json
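For reference, here is a minimal sketch of those changes, assuming GPT2/config.py still defines GPT2Config with the same keyword arguments as this repo's master branch; the values come from the gpt2-xl-config.json linked above:

```python
from GPT2.config import GPT2Config

config = GPT2Config(
    n_embd=1600,   # hidden size (768 for the small model)
    n_layer=48,    # number of transformer blocks (12 for the small model)
    n_head=25,     # attention heads; 1600 / 25 = 64 dims per head
)
# The vocab size (50257) and context length (n_ctx / n_positions = 1024)
# are the same as for the small model, so the remaining defaults can stay.
```

You can either edit the defaults in GPT2/config.py directly, as suggested above, or pass these values wherever the config object is constructed before the weights are loaded.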