explosion / spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python
https://spacy.io
MIT License

How is the loss function calculated in spaCy NER? #5392

Closed HarshMultani closed 4 years ago

HarshMultani commented 4 years ago

I want to extract organization names from addresses, and for that I am annotating the organization names in my training dataset (addresses) as "B-org", "I-org" and "L-org". After training, I am getting losses around 2000. Can someone explain how this loss is calculated, and also give me some ideas on how I can tweak my code so that the loss gets reduced?

P.S. While testing the same model on another dataset, I am getting these P/R/F values:

- "B-org": {'p': 96.875, 'r': 86.11111111111111, 'f': 91.1764705882353}
- "I-org": {'p': 93.54838709677419, 'r': 93.54838709677419, 'f': 93.54838709677419}
- "L-org": {'p': 85.71428571428571, 'r': 80.0, 'f': 82.75862068965519}
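(For reference, the `f` values here are F1-scores, the harmonic mean of precision and recall; a quick sanity check against the "B-org" numbers above:)

```python
# F1 = harmonic mean of precision (p) and recall (r),
# checked against the reported "B-org" scores.
p, r = 96.875, 86.11111111111111
f = 2 * p * r / (p + r)
print(f)  # ≈ 91.1764705882353, matching the reported value
```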

Also, while training I am using the "en_core_web_lg" pretrained model and then training on top of it with my dataset, which is annotated with my own labels and does not contain any of the labels that were part of the pretrained model.


svlandeg commented 4 years ago

You can find the calculation of the loss for the NER (and parser) component here: https://github.com/explosion/spaCy/blob/v2.3.x/spacy/syntax/nn_parser.pyx#L566

What is causing your loss to be relatively high is that the loss is not divided by the number of examples: it is summed over the whole dataset. So it can look large while you still have a pretty well-trained model.
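A minimal sketch of that point, with hypothetical numbers (the dataset size here is made up): since the reported loss is summed over all examples, normalizing it yourself gives a value that is comparable across datasets of different sizes.

```python
# spaCy's reported NER loss is summed over the epoch, not averaged,
# so a raw value of ~2000 can still correspond to a well-trained model.
raw_epoch_loss = 2000.0    # value reported in losses["ner"] after an epoch
n_train_examples = 4000    # hypothetical training-set size
loss_per_example = raw_epoch_loss / n_train_examples
print(loss_per_example)    # 0.5
```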

Your approach of measuring the F-score on a development set provides a better clue of how well your model is doing. Basically, what you want to do is cut off training at the point where your training loss keeps decreasing but your F-score on the dev set starts dropping: that is where the model starts overfitting.
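A small sketch of that stopping rule (the helper name and the `patience` value are illustrative, not spaCy API): keep training while the dev F-score improves, and stop once it has failed to improve for a few evaluations in a row, even if the training loss is still going down.

```python
def should_stop(dev_f_history, patience=2):
    """Return True once the dev F-score has failed to improve
    for `patience` consecutive evaluations."""
    if len(dev_f_history) <= patience:
        return False
    best = max(dev_f_history[:-patience])
    recent = dev_f_history[-patience:]
    return all(f <= best for f in recent)

# Example: dev F-score rises, then drops for two evaluations -> stop.
history = [85.0, 88.5, 91.2, 90.8, 90.1]
print(should_stop(history))  # True
```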

You mentioned you use "en_core_web_lg" but then retrain the NER model with your own labels. If the pretrained entities are of no interest to you, you could remove the pretrained NER component from the pipeline entirely before training, so you can start with a clean slate. (You'll of course have to create a new one with nlp.create_pipe("ner") and add that to your pipeline.)
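A minimal sketch of that "clean slate" setup. A blank English pipeline stands in for "en_core_web_lg" here so the snippet runs without the model download; with the real model you would first call nlp.remove_pipe("ner") to drop the pretrained component. The label "org" is taken from the question; both the v2 and v3 call styles are shown since the thread links to both code bases.

```python
import spacy

nlp = spacy.blank("en")  # stand-in for spacy.load("en_core_web_lg")

try:
    # spaCy v3 API: add_pipe takes the factory name directly
    ner = nlp.add_pipe("ner", last=True)
except ValueError:
    # spaCy v2 API (the version this thread is about)
    ner = nlp.create_pipe("ner")
    nlp.add_pipe(ner, last=True)

ner.add_label("org")   # register your custom label before training
print(nlp.pipe_names)  # ['ner']
```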

nolanding commented 3 years ago

I'm using spaCy 2.3.5. Is there a way to exclude the component that I want to retrain?

polm commented 3 years ago

@nolanding Hey, sorry to hear you're having trouble figuring stuff out. I'm not sure how your question is related to this thread, maybe try opening a new thread in Discussions with more details about what you're trying to do?

DSLituiev commented 3 years ago

Can you point to the latest location of the loss function? The URL seems broken.

svlandeg commented 3 years ago

I've updated the link above to work for 2.3, and for 3.0 you can find the code here: https://github.com/explosion/spaCy/blob/v3.0.x/spacy/pipeline/transition_parser.pyx#L461

github-actions[bot] commented 2 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.