Loading a tf_gpt2 model from local folder

minimaxir / aitextgen

A robust Python tool for text-based AI training and generation using GPT-2.

https://docs.aitextgen.io

MIT License


Loading a tf_gpt2 model from local folder #42

Open mmagithub opened 4 years ago

mmagithub commented 4 years ago

Hi, this may be a trivial question, but how can I load a tf_gpt2 model from a local folder? I tried: ai = aitextgen(tf_gpt2="124M")

But it ignores the 124M folder I created and looks for the model on Google's servers. The problem is that, for security reasons, the cluster cannot connect to those servers, so I have to download the model locally and copy the folder to a cluster directory in order to load and fine-tune it.

Any suggestions?

Thanks, Marawan

minimaxir commented 4 years ago

The tf_gpt2 parameter is only intended for importing the base model.

If you are using an existing TensorFlow-based GPT-2 model, use the CLI converter to convert it to PyTorch: https://docs.aitextgen.io/gpt-2-simple/

I should add a note for that explicitly in the Model loading section.
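For readers on an offline cluster, here is a rough sketch of the conversion route suggested above. It is not taken from this thread. The transformers-cli flags are the ones I recall from the linked docs page, and the model_folder argument to aitextgen() is an assumption about newer aitextgen releases, so check both against your installed versions. The paths checkpoint/run1 and pytorch are placeholders, and the helper names are made up for illustration.

```python
import os

# Files a converted PyTorch GPT-2 folder is assumed to contain
# (assumption based on the linked docs page, not on this thread).
REQUIRED_FILES = ("pytorch_model.bin", "config.json")


def build_convert_command(tf_checkpoint, output_dir):
    """Build the converter command as an argument list (hypothetical helper).

    The flags are assumed from the transformers-cli converter; verify them
    with `transformers-cli convert --help` on your installed version.
    """
    return [
        "transformers-cli", "convert",
        "--model_type", "gpt2",
        "--tf_checkpoint", tf_checkpoint,
        "--pytorch_dump_output", output_dir,
        "--config", os.path.join(tf_checkpoint, "hparams.json"),
    ]


def missing_model_files(model_dir):
    """Return the expected files that are absent from model_dir."""
    return [
        name for name in REQUIRED_FILES
        if not os.path.isfile(os.path.join(model_dir, name))
    ]


def load_local_model(model_dir):
    """Load a converted model from disk, failing early if files are missing."""
    missing = missing_model_files(model_dir)
    if missing:
        raise FileNotFoundError(
            f"{model_dir} is missing: {', '.join(missing)}"
        )
    # Imported lazily so the helpers above work without aitextgen installed.
    from aitextgen import aitextgen
    # Assumption: model_folder is accepted by your aitextgen version.
    return aitextgen(model_folder=model_dir)
```

One way to use this: run the command from build_convert_command("checkpoint/run1", "pytorch") on a machine that has the TensorFlow checkpoint (for example via subprocess.run), copy the resulting pytorch folder to the cluster, and call load_local_model("pytorch") there.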