explosion / spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python
https://spacy.io
MIT License

Loss error on validation set during NER training #3827

Closed · glaey closed this issue 5 years ago

glaey commented 5 years ago

Hello, I'm currently trying to train an NER model to recognise a single new entity on custom data. The problem I have when performing hold-out training is retrieving the loss on the validation set, so I can check whether the model is over-fitting after some epochs. I've tried the approach from https://github.com/explosion/spaCy/issues/3272, but the results increase far too much (close to p=r=f1=100 at the end), which suggests that even with sgd=None it is still performing some training on the validation set. The alternative I'm using to keep track of how well the model fits is to evaluate precision/recall/F1 after each epoch. Is there something that already exists for this, such as the losses argument of the update method (sketched below)? If not, what alternative did you find?

I should mention that I created a new model with only the tokenizer and then added the ner component, with only the new label, to the pipeline afterwards. That is probably why the get_loss method (https://spacy.io/api/entityrecognizer#get_loss) is not usable for my model.
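To illustrate what I mean by the losses argument, this is roughly the kind of training loop I am running (spaCy v2; the label, the toy example and the hyperparameters here are just placeholders):

import random
import spacy
from spacy.util import minibatch

# Placeholder data; the real corpus is custom data with one new entity type.
TRAIN_DATA = [
    ("Acme Corp hired ten engineers.", {"entities": [(0, 9, "NEWLABEL")]}),
]

nlp = spacy.blank("en")            # tokenizer-only pipeline
ner = nlp.create_pipe("ner")
nlp.add_pipe(ner)
ner.add_label("NEWLABEL")

optimizer = nlp.begin_training()
for epoch in range(10):
    random.shuffle(TRAIN_DATA)
    losses = {}
    for batch in minibatch(TRAIN_DATA, size=32):
        texts, annotations = zip(*batch)
        # update() adds the per-component loss for the batch into `losses`,
        # e.g. losses["ner"]; this is what I would like for the validation set too.
        nlp.update(texts, annotations, sgd=optimizer, drop=0.2, losses=losses)
    print(epoch, "training NER loss:", losses.get("ner"))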


honnibal commented 5 years ago

Hm, hold up a second:

I've tried this #3272 but the results are increasing too much (close to p=r=f1=100 at the end) meaning that even with sgd=None it is still performing some training on the validation set.

That shouldn't be possible? If you do parser.model.to_bytes(), you can serialize to bytes and verify that the weights aren't changing. To make double sure, you can pass a dummy optimizer function:

def dummy_optimizer(weights, gradient, key=None):
    # No-op optimizer: never touches the weights, so no update can be applied.
    return None

This should make certain that no optimization can occur.
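For example, something along these lines (spaCy v2; assuming nlp is your pipeline and texts/annotations are a validation batch already prepared as Doc and GoldParse objects):

ner = nlp.get_pipe("ner")
weights_before = ner.model.to_bytes()

losses_validation = {}
# With sgd=None (or sgd=dummy_optimizer) this should only compute the loss.
ner.update(texts, annotations, sgd=None, losses=losses_validation)

# If the serialized weights are byte-for-byte identical, no training happened.
assert ner.model.to_bytes() == weights_before
print("validation NER loss:", losses_validation.get("ner"))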

glaey commented 5 years ago
            losses_validation = {}
            for batch in spacy.util.minibatch(validation_set, size=128):
                # Split the batch into texts and annotations
                texts = [model.make_doc(text) for text, annotation in batch]
                annotations = [GoldParse(model.make_doc(text), entities=annotation['entities'])
                               for text, annotation in batch]
                # sgd=None is meant to compute the loss without updating the weights
                ner.update(texts, annotations, sgd=None, losses=losses_validation)

I'll check your method, but when I run my training with the code above, my results reach 100% precision and recall, whereas without it the F1 stays at 80-82. So I'm assuming that update with sgd=None still performs the gradient descent and updates the weights.

dswah commented 5 years ago

I'm seeing the same thing. How did you resolve it, @glaey?

glaey commented 5 years ago

Hello @dswah, unfortunately I did not have the time to investigate further, so I did not resolve it :/ I mostly relied on other measures to keep track of the fit after each epoch, such as the Scorer (https://spacy.io/api/scorer#init). I agree it is not ideal, but I guess it was enough for my project to evaluate the status of the training (roughly), as in the sketch below.
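Roughly, the evaluation I ended up running after each epoch looks like this (spaCy v2; validation_set is the same list of (text, {"entities": [...]}) pairs as in the snippet above):

from spacy.gold import GoldParse
from spacy.scorer import Scorer

def evaluate_ner(model, examples):
    # examples: list of (text, {"entities": [(start, end, label), ...]}) pairs
    scorer = Scorer()
    for text, annotation in examples:
        gold = GoldParse(model.make_doc(text), entities=annotation["entities"])
        pred = model(text)  # run the full pipeline on the raw text
        scorer.score(pred, gold)
    return scorer.ents_p, scorer.ents_r, scorer.ents_f

# p, r, f = evaluate_ner(nlp, validation_set)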

lock[bot] commented 5 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.