rish-16 / gpt2client

✍🏻 gpt2-client: Easy-to-use TensorFlow Wrapper for GPT-2 117M, 345M, 774M, and 1.5B Transformer Models 🤖 📝
MIT License

774M and 1558M won't allow custom text. #20

Open alfonsoborello opened 4 years ago

alfonsoborello commented 4 years ago

It's erroneous to say that you can train the 774M and 1558M models on your own text, because the system won't allow it; only the small and medium models support that. Just FYI. Good work nonetheless. It might work on a paid Google Cloud notebook, but I haven't checked yet. It won't work on Colab.
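For reference, the fine-tuning path that does work on the smaller checkpoints looks roughly like the sketch below, following the API shown in the project README; the corpus path is a hypothetical placeholder.

```python
from gpt2_client import GPT2Client

# Fine-tuning reportedly works on the smaller checkpoints ('117M', '345M');
# the '774M' and '1558M' checkpoints fail at this step, as described above.
gpt2 = GPT2Client('345M')
gpt2.load_model()

my_corpus = './data/my_corpus.txt'  # hypothetical path to your training text
custom_text = gpt2.finetune(my_corpus, return_text=True)  # returns the generated text
```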

rish-16 commented 4 years ago

Hello there! Currently, this library is based on the original OpenAI gpt-2 repo, so the larger models can only be used for inference, not fine-tuning. In time, I'll branch off from the original and build the entire model from scratch, which will allow fine-tuning for the larger models as well.
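In the meantime, inference with one of the larger checkpoints looks roughly like this minimal sketch, again following the README's API (I'm assuming the `force_download=False` default here):

```python
from gpt2_client import GPT2Client

# The larger checkpoints can still be used for generation, just not fine-tuning.
gpt2 = GPT2Client('774M')  # or '1558M'
gpt2.load_model(force_download=False)  # downloads the checkpoint on first run

gpt2.generate(interactive=True)  # prompt the model with your own seed text
gpt2.generate(n_samples=4)       # print four unconditional samples
```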

Looking forward to your continued support!

hexive commented 4 years ago

Is the problem with fine-tuning the 1558M a GPU processing limitation or something else? Do you think running your client on an AWS instance with multiple GPUs would be worth the attempt?

alfonsoborello commented 4 years ago

In my experience, the 1558M doesn't make much of a difference; fine-tuning on the 355M is good enough. So much fanfare for nothing, a total publicity stunt. GPT-2 seems pretty much stuck and can't get any better, remarkable as it is. It really depends on the training text you upload, and I don't mean quality: sometimes it comes out with creative/funky text after as little as 40 training steps, but you sometimes have to jack up the temperature over 1.5. GPT-2 can give you good snippets, but it goes all over the place and can't finish a story.

I tried to fine-tune with the Google Cloud console, but it gives me an error; I'm in Southeast Asia at the moment and I don't know what the problem is. I don't use AWS, but you should try to clone it and see what happens. It needs tons of RAM, and when you upload previous checkpoints over 1 GB to re-fine-tune, it crashes.
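For context on the temperature remark: sampling temperature rescales the model's logits before the softmax, so values above 1.0 flatten the distribution and produce the more adventurous output described above. Here's a minimal NumPy sketch of the idea (illustrative only, not gpt2-client's internals):

```python
import numpy as np

def sample_with_temperature(logits, temperature=1.5):
    """Sample one token id from raw logits after temperature scaling.

    temperature > 1.0 flattens the distribution (more creative/funky output);
    temperature < 1.0 sharpens it (more conservative output).
    """
    scaled = logits / temperature
    scaled -= scaled.max()  # subtract max for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return np.random.choice(len(probs), p=probs)
```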