Hi Peter,
that's a great idea and we'd be very interested to see how that would affect downstream NLP tasks!
I think the good news is that fine-tuning the language model should be very easy: you can load a pre-trained LM and then pass it to the LanguageModelTrainer
to fine-tune on your target domain corpus:
from pathlib import Path
from flair.models import LanguageModel
from flair.trainers.language_model_trainer import LanguageModelTrainer, TextCorpus

# load existing language model
language_model = LanguageModel.load_language_model('/path/to/language/model.pt')

# load target domain corpus (the data folder must be passed as a Path, not a string)
corpus: TextCorpus = TextCorpus(Path('path/to/your/domain/corpus'),
                                language_model.dictionary,
                                language_model.is_forward_lm,
                                character_level=True)
# pass the trained language model to the trainer, along with the new corpus
trainer = LanguageModelTrainer(language_model, corpus)
# continue training the model on the new corpus
trainer.train('./results', sequence_length=250, mini_batch_size=100, learning_rate=20)
The pre-trained language models we distribute are downloaded into ~/.flair/embeddings when you first call them. So the big news-forward model, for instance, can be found at ~/.flair/embeddings/lm-news-english-forward-v0.2rc.pt. You could try fine-tuning one of these on the target corpus.
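For example, a fine-tuning run could start directly from that cached file. A minimal sketch, assuming the default cache location:

from pathlib import Path
from flair.models import LanguageModel

# load the cached news-forward model from Flair's default download location
model_path = Path.home() / '.flair' / 'embeddings' / 'lm-news-english-forward-v0.2rc.pt'
language_model = LanguageModel.load_language_model(str(model_path))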
With regards to the additional layers, I first have to study the ULMFiT paper in greater detail (probably sometime next week). If you have any progress to share on this, we'd appreciate it!
Thanks for your answer, Alan!
There are several interesting things in the ULMFiT paper; I think the gradual unfreezing of layers could be added to Flair first. I will look at it, probably next week; there's a freeze() method in the fast.ai code that we could include here.
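For reference, fast.ai's freeze() essentially toggles requires_grad on groups of parameters. A minimal PyTorch sketch of gradual unfreezing, with a toy model and an illustrative schedule (this is not Flair code):

import torch.nn as nn

# toy stand-in for a character language model: embedding -> LSTM -> output layer
model = nn.ModuleDict({
    'encoder': nn.Embedding(100, 32),
    'rnn': nn.LSTM(32, 64),
    'decoder': nn.Linear(64, 100),
})

def set_trainable(module: nn.Module, trainable: bool) -> None:
    # fast.ai-style freeze()/unfreeze(): toggle requires_grad on all parameters
    for param in module.parameters():
        param.requires_grad = trainable

# freeze everything first
for module in model.values():
    set_trainable(module, False)

# then unfreeze one layer group per epoch, starting closest to the output (as in ULMFiT)
for name in ['decoder', 'rnn', 'encoder']:
    set_trainable(model[name], True)
    # ... run one training epoch here, optimizing only the parameters
    # that currently have requires_grad=True ...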
Hello Peter,
that's great! Please let us know if that works - we'd be happy to include it in Flair!
I am trying to fine-tune a language model on a target corpus but am getting the following error: TypeError: unsupported operand type(s) for /: 'str' and 'str'
My script is as follows:
from pathlib import Path
from flair.data import Dictionary
from flair.models import LanguageModel
from flair.trainers.language_model_trainer import LanguageModelTrainer, TextCorpus
language_model = LanguageModel.load_language_model('./best-lm.pt')
corpus: TextCorpus = TextCorpus('./corpus', language_model.dictionary, language_model.is_forward_lm, character_level=True)
trainer = LanguageModelTrainer(language_model, corpus)
trainer.train('./results', sequence_length=250, mini_batch_size=100, learning_rate=20, max_epochs=1)
I would be happy to get assistance in resolving this.
Hello @smutuvi, you need to pass a Path (instead of a string) to the corpus to indicate the path to the data folder; the trainer joins paths with the / operator, which is only defined for Path objects, hence the TypeError. Like this:
corpus: TextCorpus = TextCorpus(Path('./corpus'),
language_model.dictionary,
language_model.is_forward_lm,
character_level=True)
Hope this helps!
Thank you @alanakbik. It works!
I'm also working on a Swahili LM. Will share it with you soon.
Cool - a Swahili LM would be great to have in Flair! Look forward to hearing about your results!
Any idea what the size of the corpus should be for fine-tuning? I am planning to fine-tune the 'news-forward' model on a social media corpus, so can you please provide some suggestions on corpus size? Currently I am thinking of a 50-million-word corpus. @alanakbik
Hi,
I've discovered the Flair framework recently and the experience so far is great! Following what has been done by Howard and Ruder with ULMFiT, and others, I would be interested in fine-tuning the language models on custom datasets and then plugging in a custom layer to do some downstream tasks.
I think I can work out the language model fine-tuning by downloading one of your pre-trained models and using it as the initialization for language model training. However, for the downstream tasks, I would like to first train only the task layer (e.g. a classification layer), and then gradually fine-tune the language model layers.
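For context, a rough sketch of how the fine-tuned LM could then be plugged into a downstream classifier with Flair; the ./classification_data folder and the hyperparameters are placeholders, and some class names differ between Flair versions:

from flair.datasets import ClassificationCorpus
from flair.embeddings import FlairEmbeddings, DocumentRNNEmbeddings
from flair.models import TextClassifier
from flair.trainers import ModelTrainer

# task corpus in FastText format (__label__<x> <text>); the folder is a placeholder
corpus = ClassificationCorpus('./classification_data')
label_dict = corpus.make_label_dictionary()

# load the fine-tuned LM as contextual string embeddings
lm_embeddings = FlairEmbeddings('./results/best-lm.pt')

# pool the character-LM states into one embedding per document
document_embeddings = DocumentRNNEmbeddings([lm_embeddings], hidden_size=256)

# classification layer on top of the document embedding
classifier = TextClassifier(document_embeddings, label_dictionary=label_dict, multi_label=False)

# note: this trains all layers jointly; gradual unfreezing as in ULMFiT
# would still have to be implemented on top
trainer = ModelTrainer(classifier, corpus)
trainer.train('./classifier', max_epochs=10)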
Thank you very much for your help!