Closed: bilal2vec closed this issue 4 years ago
Converting a PyTorch checkpoint to TF works with:

```python
from transformers import GPT2LMHeadModel, TFGPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained('gpt2-xl')
model.save_pretrained('./')

model = TFGPT2LMHeadModel.from_pretrained('./', from_pt=True)
model.save_pretrained('./out')
```
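After the conversion, the output directory should contain the TF 2 weights and config. A minimal sanity check before uploading, assuming `save_pretrained` wrote the usual `tf_model.h5` and `config.json` (the helper name here is mine, not part of the library):

```python
from pathlib import Path

# Files save_pretrained is expected to write for a TF 2 checkpoint
# (assumed conventional names; adjust if the library uses different ones).
EXPECTED_FILES = ("tf_model.h5", "config.json")

def converted_checkpoint_ok(out_dir):
    """Return True if out_dir looks like a complete TF 2 checkpoint."""
    out = Path(out_dir)
    return all((out / name).is_file() for name in EXPECTED_FILES)
```

e.g. `converted_checkpoint_ok('./out')` should be `True` once both files are in place.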
If you can tell me where to upload the TF checkpoint to, I'll open up a pull request.
Hi @bkkaggle, thanks for pointing this out! @julien-c, could you maybe help out here? While the model checkpoint

```
"gpt2-xl": "https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-xl-pytorch_model.bin",
```

exists for PyTorch, it does not yet exist for TF 2. Could we add it as well?
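A minimal sketch of why the download fails, not the library's actual internals: `from_pretrained` resolves a model name to a checkpoint URL through a per-framework archive map, and the TF 2 map simply has no `gpt2-xl` entry. The map names, the placeholder TF URL, and `resolve()` below are illustrative assumptions; only the gpt2-xl PyTorch URL comes from the comment above.

```python
# Illustrative sketch, NOT transformers' real code: per-framework
# name -> URL maps, with gpt2-xl present for PyTorch but absent for TF 2.
PT_ARCHIVE_MAP = {
    "gpt2-xl": "https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-xl-pytorch_model.bin",
}
TF2_ARCHIVE_MAP = {
    # hypothetical placeholder URL; the point is that "gpt2-xl" is missing
    "gpt2": "<url-to-gpt2-tf2-checkpoint>",
}

def resolve(name, archive_map):
    """Look up the download URL for a model name, failing on unknown names."""
    try:
        return archive_map[name]
    except KeyError:
        raise OSError(f"Can't load weights for '{name}': no checkpoint URL registered")
```

Under these assumptions, `resolve("gpt2-xl", TF2_ARCHIVE_MAP)` raises, which mirrors the failure mode described in the issue.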
## 🐛 Bug

### Information

Model I am using (Bert, XLNet ...): TFGPT2LMHeadModel

Language I am using the model on (English, Chinese ...): English

The problem arises when using:

See colab: https://colab.research.google.com/drive/12gEGdxUjyVLBSUjkjngAWiE_ENIUIV8o

The tasks I am working on is:

Finetuning gpt2-xl on wikitext2

The colab notebook works for all model sizes except for gpt2-xl, where it throws an error. It looks like it can't download the correct checkpoint from the model name (gpt2-xl). I tried running the colab notebook with other gpt2 models and they all work.

Stack trace:

### To reproduce

Run the colab notebook.

### Expected behavior

All gpt2 model sizes, including gpt2-xl, should work.

### Environment info

transformers version: master