openai / finetune-transformer-lm

Code and model for the paper "Improving Language Understanding by Generative Pre-Training"
https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
MIT License

transfer learning code #18

Closed · xuy2 closed this 6 years ago

xuy2 commented 6 years ago

Have you released the code for language model training?

xuy2 commented 6 years ago

I'd like to use the code you provided for sequence labeling tasks such as POS tagging and Named Entity Recognition. However, the preprocessing in your code splits infrequent words into subword units, and I don't know how to use these split words for Named Entity Recognition... Therefore, I'd like to retrain the transformer language model myself. Could you kindly let me know the training time?
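For illustration, here is a rough sketch of the mismatch I mean (a toy `toy_bpe_tokenize` stands in for the repo's `text_utils` encoder; none of this is the repo's actual code): once rare words are split into sub-tokens, the word-level NER labels no longer line up one-to-one.

```python
# Hypothetical illustration (not the repo's code): BPE splits rare words into
# sub-tokens, so word-level NER labels stop lining up one-to-one with tokens.

def toy_bpe_tokenize(word):
    # Toy stand-in for the repo's text_utils encoder: split long ("rare") words.
    return [word] if len(word) <= 7 else [word[:3], word[3:] + "</w>"]

def align_labels_to_subtokens(words, labels, bpe_tokenize):
    """Repeat each word's label across the sub-tokens it is split into."""
    sub_tokens, sub_labels = [], []
    for word, label in zip(words, labels):
        pieces = bpe_tokenize(word)
        sub_tokens.extend(pieces)
        sub_labels.extend([label] * len(pieces))
    return sub_tokens, sub_labels

words  = ["Kasparov", "visited", "Baku"]
labels = ["B-PER", "O", "B-LOC"]
print(align_labels_to_subtokens(words, labels, toy_bpe_tokenize))
# (['Kas', 'parov</w>', 'visited', 'Baku'], ['B-PER', 'B-PER', 'O', 'B-LOC'])
```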

wugh commented 6 years ago

I'm trying to train the language model on Chinese. Maybe we can discuss and share some experience.

madisonmay commented 6 years ago

@xuy2 we're using this base model for sequence labeling tasks over at our fork of the repo. Docs for the SequenceLabeler wrapper class are here. We've thought a little bit about how to deal with the problem of predictions being made at the subtoken level, and opted to simply round predictions off to the nearest token as a post-processing step. You can optionally disable this behavior in the config.
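In rough pseudocode, the rounding idea amounts to collapsing sub-token predictions back onto their source tokens, e.g. by majority vote. This is just a minimal sketch of the concept, not the fork's actual implementation, and the function and variable names are made up:

```python
from collections import Counter

# Sketch only: collapse sub-token label predictions back to whole tokens.
# sub_labels: predicted label per sub-token.
# token_ids:  index of the source token each sub-token came from.

def round_to_tokens(sub_labels, token_ids):
    per_token = {}
    for label, tok in zip(sub_labels, token_ids):
        per_token.setdefault(tok, []).append(label)
    # Majority vote over each token's sub-token labels, in token order.
    return [Counter(labels).most_common(1)[0][0]
            for _, labels in sorted(per_token.items())]

# e.g. sub_labels = ["B-PER", "B-PER", "O"], token_ids = [0, 0, 1] -> ["B-PER", "O"]
print(round_to_tokens(["B-PER", "B-PER", "O"], [0, 0, 1]))
```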

If you end up adapting the code in this repo, you may want to consider adding one more attention block on top of the model to give it some forward context; as it currently stands, future tokens are masked off from the model's representation of the sequence.
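As a rough sketch of that idea (toy numpy, single head, no learned projections; not the repo's TensorFlow code): the pretrained blocks apply a causal mask, while one extra block without the mask would let each position also attend to future tokens before the labeling head.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(h, causal=False):
    """h: [seq_len, d_model] hidden states; single head, projections omitted."""
    d = h.shape[-1]
    scores = h @ h.T / np.sqrt(d)
    if causal:
        # Causal mask, as in the pretrained LM blocks: no attention to future tokens.
        mask = np.triu(np.ones_like(scores), k=1)
        scores = scores - 1e9 * mask
    return softmax(scores) @ h  # [seq_len, d_model]

h = np.random.randn(5, 8)
lm_out = attention(h, causal=True)        # what the pretrained blocks do
bidir_out = attention(lm_out, causal=False)  # extra unmasked block adds forward context
```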

xuy2 commented 6 years ago

@madisonmay Thanks for your excellent work.

xuy2 commented 6 years ago

@wugh I sent you an e-mail to the address in your profile.