murthyrudra / NeuralNER

Implementation of Multilingual Neural NER
GNU General Public License v3.0

Where is the multilingual learning happening ? #2

Closed. samarohith closed this issue 5 years ago

samarohith commented 5 years ago

Hello sir, I don't understand where the 'multilingual learning' happens. Is it in the word embeddings? Also, could you tell me what the 'development file' is? Thanks in advance.

murthyrudra commented 5 years ago

Hi, the folders NeuralNERYang and NeuralNERALLShared implement multilingual learning. In NeuralNERYang, the word embeddings and the CNN filters (which act on the character sequence) are shared between languages. In NeuralNERALLShared, the model is trained in a language-independent way, essentially sharing all layers between the two languages.
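To make the NeuralNERYang-style sharing concrete, here is a minimal PyTorch sketch, not the repository's actual code. It assumes a joint word vocabulary across the two languages and assumes that everything above the shared encoder (the BiLSTM and output layer here) is language-specific. The class names `SharedCharWordEncoder` and `Tagger` and all layer sizes are made up for illustration.

```python
import torch
import torch.nn as nn


class SharedCharWordEncoder(nn.Module):
    """Word embeddings plus a character CNN, intended to be shared across languages."""

    def __init__(self, n_words, n_chars, word_dim=8, char_dim=4, n_filters=6):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, word_dim)
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.char_cnn = nn.Conv1d(char_dim, n_filters, kernel_size=3, padding=1)
        self.out_dim = word_dim + n_filters

    def forward(self, word_ids, char_ids):
        # word_ids: (n_tokens,), char_ids: (n_tokens, max_chars)
        chars = self.char_emb(char_ids).transpose(1, 2)  # (n_tokens, char_dim, max_chars)
        # Max-pool the CNN output over character positions.
        char_feat = torch.relu(self.char_cnn(chars)).max(dim=2).values
        return torch.cat([self.word_emb(word_ids), char_feat], dim=1)


class Tagger(nn.Module):
    """Language-specific layers (an assumption) on top of a shared encoder."""

    def __init__(self, encoder, n_tags, hidden=5):
        super().__init__()
        self.encoder = encoder  # the same object for every language, not a copy
        self.lstm = nn.LSTM(encoder.out_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, word_ids, char_ids):
        feats = self.encoder(word_ids, char_ids).unsqueeze(0)  # (1, n_tokens, out_dim)
        states, _ = self.lstm(feats)
        return self.out(states.squeeze(0))  # (n_tokens, n_tags)


encoder = SharedCharWordEncoder(n_words=50, n_chars=30)
primary = Tagger(encoder, n_tags=5)    # primary-language tagger
assisting = Tagger(encoder, n_tags=5)  # assisting-language tagger
```

Because both taggers hold the same `encoder` object, a gradient step on a batch from either language updates the same word embeddings and CNN filters. Sharing all layers, as in NeuralNERALLShared, would correspond to using a single `Tagger` for both languages.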

The 'development file' (sometimes called the 'tune' or 'validation' split) is used only to monitor the loss. After every epoch of training on the training data, we evaluate the loss on the development file and use it to decide when to stop training.
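As a rough illustration of that stopping decision, here is a minimal sketch in plain Python. The patience rule (stop after a fixed number of epochs without improvement) is an assumption and not necessarily the exact criterion used in this repository. `train_one_epoch` and `dev_loss` are hypothetical callables standing in for the real training and evaluation code.

```python
def train_with_early_stopping(train_one_epoch, dev_loss, max_epochs=50, patience=3):
    """Stop once the development loss has not improved for `patience` epochs.

    `train_one_epoch` and `dev_loss` are hypothetical callables supplied by the caller.
    Returns the (epoch, loss) pair with the lowest development loss.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        train_one_epoch()   # one pass over the training data
        loss = dev_loss()   # loss on the development file
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
            # A real script would typically save a model checkpoint here.
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch, best_loss


# Toy run with simulated development losses instead of a real model.
simulated = iter([0.9, 0.7, 0.6, 0.65, 0.64, 0.66, 0.5])
print(train_with_early_stopping(lambda: None, lambda: next(simulated)))
```

In the toy run the loss stops improving after the third epoch, so training halts three epochs later and the final simulated value is never evaluated.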

samarohith commented 5 years ago

I see that the training-language data is passed through the "--train" argument. Where is the assisting-language data passed? Also, can you tell me what the "ner_tag_field" argument is? I am new to PyTorch, so please bear with me for asking silly questions!

murthyrudra commented 5 years ago

NeuralNERMono is for monolingual training. If you look at the other two folders, you will see a trainAux parameter that specifies the train split of the auxiliary/assisting language. ner_tag_field specifies the column number of the NER tag, that is, whether the named entity labels are in the second column, the third column, and so on.
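To show what a column index like ner_tag_field selects, here is a small sketch of a reader for column-formatted data. It assumes a CoNLL-style layout (one token per line, whitespace-separated columns, the token in the first column, a blank line between sentences) and a zero-based column index. Both are assumptions, so check the repository's own loader for the actual convention. `read_column_data` is a hypothetical helper, not a function from this repository.

```python
def read_column_data(lines, ner_tag_field=1):
    """Parse CoNLL-style lines into (tokens, tags) pairs, one pair per sentence.

    `ner_tag_field` is treated as a zero-based column index (an assumption).
    """
    sentences, tokens, tags = [], [], []
    for line in lines:
        line = line.strip()
        if not line:
            # A blank line marks the end of a sentence.
            if tokens:
                sentences.append((tokens, tags))
                tokens, tags = [], []
            continue
        columns = line.split()
        tokens.append(columns[0])
        tags.append(columns[ner_tag_field])
    if tokens:
        # Keep the last sentence when the input has no trailing blank line.
        sentences.append((tokens, tags))
    return sentences


# Here the NER labels sit in the third column, so the zero-based index is 2.
sample = [
    "John NNP B-PER",
    "lives VBZ O",
    "in IN O",
    "Paris NNP B-LOC",
    "",
]
print(read_column_data(sample, ner_tag_field=2))
```

With `ner_tag_field=1` the same call would return the part-of-speech column instead, which is why the argument has to match the layout of the data files.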

samarohith commented 5 years ago

Thanks for your response.