EntityRecognizer.predict returns spacy.syntax.stateclass.StateClass instead of the tuple of scores and tensors described in the docs, and ner.get_loss raises an AttributeError because the attribute does not exist.

explosion / spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python

https://spacy.io

MIT License

Issue #4959

Status: Closed (jaingaurav3 closed this 3 years ago)

jaingaurav3 commented 4 years ago

How to reproduce the behaviour

import spacy

nlp = spacy.load("en_core_web_sm")
doc_test = nlp("hi, we are in London")
ner_test = nlp.get_pipe('ner')
ner_test.predict(doc_test)
# this returns --> [<spacy.syntax.stateclass.StateClass at 0x295dd8603f0>]

ner_test.get_loss()
# AttributeError: 'spacy.pipeline.pipes.EntityRecognizer' object has no attribute 'get_loss'

Your Environment

Operating System: Windows 10
Python Version Used: '3.7.1 (default, Nov 24 2018, 22:14:32) [MSC v.1912 64 bit (AMD64)]'
spaCy Version Used: '2.2.3'
Environment Information: Anaconda3

adrianeboyd commented 4 years ago

Hi, thanks for the report! Sorry, it looks like the docs are out-of-date here for both EntityRecognizer and DependencyParser, which are based on Parser. Until we have a chance to update the docs, your best bet is to look at the code in syntax/nn_parser.pyx, e.g., there's get_batch_loss() instead of get_loss():

https://github.com/explosion/spaCy/blob/9c08d9baa31622e9e9daff37a9774774e42d8778/spacy/syntax/nn_parser.pyx#L422-L463

You can also see how set_annotations() processes the output from predict() (states or beams instead of (scores, tensors)):

https://github.com/explosion/spaCy/blob/9c08d9baa31622e9e9daff37a9774774e42d8778/spacy/syntax/nn_parser.pyx#L354-L376

Is there something specific you're trying to achieve?

jaingaurav3 commented 4 years ago

Hi @adrianeboyd, thanks for responding.

Yes, I want to calculate the loss on validation data while training a custom NER model with spaCy, and I am looking for a way to do that. I tried creating a dummy optimizer and passing it as sgd, as mentioned in https://github.com/explosion/spaCy/issues/3272, but this gives an error:

--> 502 get_grads.alpha = sgd.alpha
    503 get_grads.b1 = sgd.b1
    504 get_grads.b2 = sgd.b2

AttributeError: 'function' object has no attribute 'alpha'
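
The traceback suggests that the object passed as sgd is expected to expose alpha, b1 and b2 attributes, which a bare function does not have. The following is a minimal sketch in plain Python (no spaCy import) of that failure and of one possible workaround. The names NoOpOptimizer, copy_hyperparams and plain_function_optimizer are hypothetical, the default hyperparameter values are placeholders, and the call signature of the optimizer is an assumption rather than something confirmed for spaCy 2.2.3.

```python
class NoOpOptimizer:
    """Hypothetical stand-in optimizer.

    It carries the three attributes that the traceback shows being read
    (alpha, b1, b2) and ignores every update it is asked to apply.
    """

    def __init__(self, alpha=0.001, b1=0.9, b2=0.999):
        # Placeholder values; they are only copied, never used for updates.
        self.alpha = alpha
        self.b1 = b1
        self.b2 = b2

    def __call__(self, weights, gradient, key=None, **kwargs):
        # Assumed call signature. Deliberately does nothing, so the
        # weights passed in are left unchanged.
        return None


def copy_hyperparams(sgd):
    """Toy reproduction of the three assignments shown in the traceback."""
    def get_grads(W, dW, key=None):
        get_grads.collected[key] = (W, dW)

    get_grads.collected = {}
    get_grads.alpha = sgd.alpha  # fails if sgd is a bare function
    get_grads.b1 = sgd.b1
    get_grads.b2 = sgd.b2
    return get_grads


def plain_function_optimizer(weights, gradient, key=None):
    """A dummy optimizer written as a plain function (no attributes)."""
    return None


# A bare function has no 'alpha' attribute, so the copy step fails.
try:
    copy_hyperparams(plain_function_optimizer)
    error = None
except AttributeError as exc:
    error = str(exc)

# An object that defines the attributes passes the same step.
wrapped = copy_hyperparams(NoOpOptimizer())
print(error)
print(wrapped.alpha, wrapped.b1, wrapped.b2)
```

If this matches what the library does internally, passing an instance such as NoOpOptimizer() as sgd to nlp.update, together with a losses dict, might yield a loss value without changing the weights. That is untested here and should be checked against the linked nn_parser.pyx and the installed spaCy version.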

Passing sgd=None instead of dummy_optimizer does not give this error. In that case, however, the F1 score, precision and recall suggest that the spaCy model is overfitting: training without the validation data gives an F1 score of 90% on the validation data after 10 iterations, whereas training with the validation data and sgd=None (just to get the validation loss) gives an F1 score of 98% after 10 iterations, as mentioned in https://github.com/explosion/spaCy/issues/3827.

So I thought I would call the predict function separately on the validation data and then use the get_loss function to get the validation loss.

I'd appreciate it if you could help me with this.

svlandeg commented 3 years ago

It looks like the original documentation issue was addressed by https://github.com/explosion/spaCy/pull/5223.

For more concrete coding help or discussions, it's probably better to open a new issue on the new discussions board, as the format of the issue tracker is less suited for that purpose ;-)

github-actions[bot] commented 2 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.