inception-project / inception-external-recommender

Get annotation suggestions for the INCEpTION text annotation platform from spaCy, Sentence BERT, scikit-learn and more. Runs as a web-service compatible with the external recommender API of INCEpTION.
Apache License 2.0

spacy example does not work as advertised #19

Closed: reckart closed this issue 3 years ago

reckart commented 3 years ago

When I put the spaCy example shown in the README file into a .py file and run it, it produces an error:

OSError: [E941] Can't find model 'en'. It looks like you're trying to load a model from a shortcut, which is deprecated as of spaCy v3.0. To load the model, use its full name instead:

nlp = spacy.load("en_core_web_sm")

DavidHuebner commented 3 years ago

Sorry to intervene here, but I believe this issue needs another small change. Right now, when I execute the SpacyPosClassifier pipeline, all PoS tags are empty strings. The reasons are twofold:

  1. The tagger pipeline component needs word embeddings to exist. Hence, one needs to manually call self._model.get_pipe("tok2vec")(doc) before calling self._model.get_pipe("tagger")(doc).
  2. In spaCy v3, the PoS tag information is now stored in spacy_token.tag_ instead of spacy_token.pos_.
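A minimal sketch of the two points above, not the recommender's actual code. The helper name `fine_grained_tags` is made up for illustration, and it assumes `nlp` is a loaded spaCy v3 pipeline with components named "tok2vec" and "tagger":

```python
def fine_grained_tags(nlp, text):
    """Return the fine-grained tag of each token, running only tok2vec and tagger.

    Hypothetical helper: `nlp` is assumed to be a loaded spaCy v3 pipeline
    that contains components named "tok2vec" and "tagger".
    """
    # Tokenize only; no pipeline components are applied at this point.
    doc = nlp.make_doc(text)

    # Run tok2vec first so the tagger has embeddings to work with,
    # then run the tagger itself.
    for name in ("tok2vec", "tagger"):
        doc = nlp.get_pipe(name)(doc)

    # Read tag_ rather than pos_, as described in point 2.
    return [token.tag_ for token in doc]
```

With the model installed, this could be called as `fine_grained_tags(spacy.load("en_core_web_sm"), "This is a test.")`.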

Please also see my discussion here: https://github.com/explosion/spaCy/issues/7105#

BTW: Unfortunately, the spaCy test does not catch the problem, because all PoS values are empty strings, i.e. not None, and hence the following assert is never triggered.

for prediction in predictions:
    assert getattr(prediction, PREDICTED_FEATURE) is not None
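To illustrate why that check lets empty strings through, here is a small self-contained sketch. The `SimpleNamespace` objects stand in for real predictions, and both the feature name "value" and the helper `find_blank_predictions` are assumptions for illustration, not names from the project:

```python
from types import SimpleNamespace

PREDICTED_FEATURE = "value"  # hypothetical feature name, for illustration only


def find_blank_predictions(predictions, feature):
    """Return the indices of predictions whose feature is None or a blank string."""
    blank = []
    for index, prediction in enumerate(predictions):
        value = getattr(prediction, feature, None)
        if value is None or (isinstance(value, str) and not value.strip()):
            blank.append(index)
    return blank


# Stand-in predictions: the second one has an empty tag.
predictions = [
    SimpleNamespace(value="NN"),
    SimpleNamespace(value=""),
]

# The "is not None" check passes even though one tag is empty.
assert all(getattr(p, PREDICTED_FEATURE) is not None for p in predictions)

# A stricter check reports the empty tag.
blank = find_blank_predictions(predictions, PREDICTED_FEATURE)
print(blank)
```

A test along these lines could then assert that the returned list is empty.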

jcklie commented 3 years ago

Thank you for the report. I changed the recommender and the test; I hope it works now.