Franck-Dernoncourt / NeuroNER

Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.
http://neuroner.com
MIT License

Using NeuroNER for transfer learning #29

Open ArmandGiraud opened 7 years ago

ArmandGiraud commented 7 years ago

Hello Franck, I saw you co-authored another article based on this code, about transfer learning in the NER context. Is the mentioned extension already included in the repo?

I want to train a new entity type (job title). I have a large training set for this entity (30k sentences), but it was generated automatically and is quite noisy. I also have a smaller set (5k sentences) in which I have manually annotated the job titles. I want to train on the noisy data first and then transfer to the manually annotated data. Is transfer learning relevant in this context?

I would also like to know how to configure NeuroNER to perform transfer learning on this new dataset. Is it simply a matter of further training a pre-trained model with a higher learning rate and fewer epochs? How do you restrict the parameter transfer to specific layers only?

Thanks for your help. BR Armand

Franck-Dernoncourt commented 7 years ago

Is the mentioned extension already included in the repo?

Yes!

Is transfer learning relevant in this context?

Sounds like a use case for it.

Is it simply a matter of further training a pre-trained model with a higher learning rate and fewer epochs?

The learning rate doesn't have to be higher. In fact, one often reduces the learning rate once the model has already been trained for many epochs. The number of epochs pretty much depends on the performance you observe.

How do you restrict the parameter transfer to specific layers only?

https://github.com/Franck-Dernoncourt/NeuroNER/blob/master/src/parameters.ini:

# specify which layers to reload from the pretrained model
reload_character_embeddings = True
reload_character_lstm = True
reload_token_embeddings = True
reload_token_lstm = True
reload_feedforward = True
reload_crf = True
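
For context, here is a minimal sketch of the other settings involved in such a fine-tuning run. The folder paths are placeholders and the values are only illustrative; check the option names against your copy of parameters.ini.

# start from the model trained on the noisy data, then continue training
# on the manually annotated set (paths and values below are placeholders)
train_model = True
use_pretrained_model = True
pretrained_model_folder = ../trained_models/noisy_job_titles
dataset_text_folder = ../data/job_titles_manual

# fine-tuning usually calls for a lower learning rate and fewer epochs
learning_rate = 0.001
maximum_number_of_epochs = 20
patience = 5
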
ArmandGiraud commented 7 years ago

Franck, thank you very much for this fast and complete answer!

The learning rate doesn't have to be higher.

Indeed, the learning rate should be reduced instead...

I'll try it, and I hope I can come back to you later with a pretrained model I can share :).

Thanks again for your commitment and this great work. Armand

chunniunai220ml commented 6 years ago

Hi ArmandGiraud, I would like to know: have you finished your transfer learning experiment for the job-title task? How were the results? I ask because I have another task with a new entity type (a small dataset of title data), and I plan to use transfer learning. Looking forward to your reply. Lulu

ArmandGiraud commented 6 years ago

Hello Lulu, using transfer learning helped while the amount of in-domain data was small, but beyond a certain ratio of manually tagged data, the transfer from noisy data no longer helped.

The best option I found for training a new entity type with a small amount of in-domain labeled data is active learning, as in this paper: you can easily reach a decent F-score (around 70) with a few hundred sentences.

Another option is to make use of the Semantic Web (DBpedia, Freebase) to automatically build a large corpus from anchor links... This eventually worked for me.
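
In case it helps, here is a rough, hypothetical sketch of that idea: given a gazetteer of job titles (for example, collected from Wikipedia/DBpedia anchor texts), tag sentences with greedy longest-match BIO labels and write them out in the CoNLL-style format NeuroNER reads (token in the first column, label in the last, blank line between sentences). The gazetteer entries, output file name, and whitespace tokenization are placeholders, not the exact pipeline I used.

# hypothetical sketch: turn a job-title gazetteer into noisy BIO training data
# (gazetteer entries, paths and the whitespace tokenizer are placeholders)

JOB_TITLES = {("software", "engineer"), ("data", "scientist"), ("nurse",)}
MAX_LEN = max(len(entry) for entry in JOB_TITLES)

def bio_tag(tokens):
    """Greedy longest-match tagging of gazetteer entries as JOBTITLE entities."""
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        match = 0
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            if tuple(t.lower() for t in tokens[i:i + n]) in JOB_TITLES:
                match = n
                break
        if match:
            tags[i] = "B-JOBTITLE"
            for j in range(i + 1, i + match):
                tags[j] = "I-JOBTITLE"
            i += match
        else:
            i += 1
    return tags

def write_conll(sentences, path):
    """Write one 'token label' pair per line, blank line between sentences."""
    with open(path, "w", encoding="utf-8") as f:
        for sentence in sentences:
            tokens = sentence.split()
            for token, tag in zip(tokens, bio_tag(tokens)):
                f.write(f"{token} {tag}\n")
            f.write("\n")

if __name__ == "__main__":
    write_conll(["She works as a data scientist in Paris .",
                 "The hospital hired a nurse yesterday ."],
                "train.txt")

The output is only as clean as the gazetteer and the matching heuristic, which is exactly the kind of noise the fine-tuning step discussed above is meant to absorb.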

Good luck! Regards, Armand