Franck-Dernoncourt / NeuroNER

Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.
http://neuroner.com
MIT License
1.69k stars, 476 forks

use pre-trained model on i2b2 2014 dataset #152

Open InternetMedical opened 4 years ago

InternetMedical commented 4 years ago

Dear Franck, NeuroNER is really great work. The detailed answers that you and the other developers have given to users helped me a lot when I ran into the same problems. But I still need your help.

In your paper, Transfer Learning for Named-Entity Recognition with Neural Networks, you wrote:

'we apply transfer learning by training the parameters of the ANN model on the source dataset (MIMIC), and using the same ANN to retrain on the target dataset (i2b2 2014 or 2016) for fine-tuning.'

Can you tell me the details of how you achieved this?

I tried to fine-tune the pre-trained model mimic_glove_spacy_bioes on the i2b2 dataset. Among the parameters I used for fine-tuning were: '--train_model=True --use_pretrained_model=True'

But I got the following error: 'AssertionError: The label B-BIOID does not exist in the pretraining dataset. Please ensure that only the following labels exist in the dataset: B-AGE, B-COUNTRY, B-DATE, B-DOCTOR, B-HOSPITAL, B-IDNUM, B-LOCATION_OTHER, B-PATIENT, B-PHONE, B-STATE, B-STREET, B-ZIP, E-AGE, E-COUNTRY, E-DATE, E-DOCTOR, E-HOSPITAL, E-IDNUM, E-LOCATION_OTHER, E-PATIENT, E-PHONE, E-STATE, E-STREET, E-ZIP, I-AGE, I-COUNTRY, I-DATE, I-DOCTOR, I-HOSPITAL, I-IDNUM, I-LOCATION_OTHER, I-PATIENT, I-PHONE, I-STATE, I-STREET, I-ZIP, O, S-AGE, S-COUNTRY, S-DATE, S-DOCTOR, S-HOSPITAL, S-IDNUM, S-LOCATION_OTHER, S-PATIENT, S-PHONE, S-STATE, S-STREET, S-ZIP'

It seems that the labels in the i2b2 dataset differ from the labels in the MIMIC dataset. Can you tell me how to fine-tune the pre-trained model mimic_glove_spacy_bioes on the i2b2 dataset?
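
One way to confirm the mismatch before fine-tuning is to list every label in the target data that is missing from the label set quoted in the error message. The following is only a minimal sketch, not part of NeuroNER: it assumes the i2b2 data has already been converted to CoNLL-style lines with the label as the last whitespace-separated field, and the helper names `labels_in_conll` and `unsupported_labels` as well as the sample lines are hypothetical.

```python
# Labels accepted by the pre-trained model, rebuilt from the error message:
# each BIES prefix combined with each entity type, plus the outside tag "O".
ENTITY_TYPES = [
    "AGE", "COUNTRY", "DATE", "DOCTOR", "HOSPITAL", "IDNUM",
    "LOCATION_OTHER", "PATIENT", "PHONE", "STATE", "STREET", "ZIP",
]
PRETRAINED_LABELS = {"O"} | {f"{p}-{t}" for p in "BIES" for t in ENTITY_TYPES}


def labels_in_conll(lines):
    """Collect the distinct labels from CoNLL-style lines.

    Assumption: one token per line, label in the last column,
    blank lines separating sentences.
    """
    labels = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("-DOCSTART-"):
            continue  # skip sentence and document separators
        labels.add(line.split()[-1])
    return labels


def unsupported_labels(lines, known=PRETRAINED_LABELS):
    """Return the labels found in `lines` that the pre-trained model lacks."""
    return sorted(labels_in_conll(lines) - known)


if __name__ == "__main__":
    # Made-up sample lines for illustration only, not real i2b2 data.
    sample = [
        "Record O",
        "ID B-BIOID",
        "12345 E-BIOID",
        "",
        "Dr. O",
        "Smith S-DOCTOR",
    ]
    print(unsupported_labels(sample))
```

To run the check on real data, pass an open file object (for example `open(path, encoding="utf-8")`, where `path` points to the converted training file) instead of `sample`. Any label it reports would have to be mapped to one of the 49 pre-trained labels or otherwise handled before this particular assertion can pass.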