ConnorJL / GPT2

An implementation of GPT2 training that supports TPUs
MIT License

character-level #28

Closed amacfie closed 4 years ago

amacfie commented 4 years ago

Is there a way to build a character-level model?

ConnorJL commented 4 years ago

This repo is not actively maintained and there are some rough edges, but roughly:

1. Create a new encoder that maps characters to tokens (Hugging Face has a good tokenizer library that's better than the one used here).
2. Modify the create_tfrecords.py script to encode your text with the new encoder.
3. Change the "n_vocab" parameter to the number of tokens in your new char vocabulary, and train the model on your new data.
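The first step above could look something like this minimal sketch of a character-level encoder. The class name and interface here are illustrative, not the repo's actual encoder API; the point is just to build a fixed character vocabulary and map text to integer ids and back:

```python
class CharEncoder:
    """Toy character-level encoder (hypothetical interface, for illustration)."""

    def __init__(self, corpus):
        # Vocabulary = sorted set of unique characters seen in the corpus.
        chars = sorted(set(corpus))
        self.char_to_id = {c: i for i, c in enumerate(chars)}
        self.id_to_char = {i: c for i, c in enumerate(chars)}

    @property
    def vocab_size(self):
        # This is the value the vocab-size hyperparameter would be set to.
        return len(self.char_to_id)

    def encode(self, text):
        return [self.char_to_id[c] for c in text]

    def decode(self, ids):
        return "".join(self.id_to_char[i] for i in ids)


enc = CharEncoder("hello world")
ids = enc.encode("hello")
assert enc.decode(ids) == "hello"
```

In create_tfrecords.py you would then call something like `enc.encode(text)` in place of the existing BPE encoding step before serializing the ids. A real setup should also reserve ids for unknown characters and any special tokens.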