Closed: adam-ra closed this issue 3 years ago
The lemmatisation of contracted forms such as “n't” was very useful: it made it straightforward to recognise that “cannot”, “can't” and even “cant” were forms of the same lemmas.
Thanks for reporting, and sorry you're having trouble with this. We are aware of this issue and are working on it; please see https://github.com/explosion/spaCy/issues/7014.
spaCy 2.3, en-core-web-lg
“I can't go”: (orth / lemma)
spaCy 3.0.2, en-core-web-lg
Similarly for “We don't like it.”, “I cannot do that.”, “He won't survive.”
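For anyone who wants to check the orth / lemma pairs on their own installation, something along these lines should work. It is only a sketch: the helper names (`orth_lemma_pairs`, `format_pairs`, `main`) are made up for illustration, and it assumes the model is installed under the import name `en_core_web_lg`. No output is shown here because it depends on the installed spaCy and model versions.

```python
# Sentences quoted in this report.
SENTENCES = [
    "I can't go",
    "We don't like it.",
    "I cannot do that.",
    "He won't survive.",
]


def orth_lemma_pairs(nlp, text):
    """Return an (orth, lemma) tuple for every token that `nlp` yields for `text`.

    `nlp` is any callable returning token-like objects with `orth_` and
    `lemma_` attributes, e.g. a loaded spaCy pipeline.
    """
    return [(token.orth_, token.lemma_) for token in nlp(text)]


def format_pairs(pairs):
    """Render the pairs one per line as 'orth / lemma'."""
    return "\n".join(f"{orth} / {lemma}" for orth, lemma in pairs)


def main():
    # Imported here so the helpers above can be used without spaCy installed.
    import spacy

    # Assumption: the large English model is installed in this environment.
    nlp = spacy.load("en_core_web_lg")
    print("spaCy", spacy.__version__)
    for sentence in SENTENCES:
        print(sentence)
        print(format_pairs(orth_lemma_pairs(nlp, sentence)))
        print()
```

Calling `main()` once in a spaCy 2.3 environment and once in a 3.0.2 environment gives two listings that can be compared side by side.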
This will break some systems that depend on lemmas or lemma-based patterns, e.g. for negation discovery. Changes such as this are surprising because, as far as I understand, these models have exactly the same name and have been trained on the same corpus.
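Until this is resolved upstream, a possible stopgap for negation discovery is to normalise the contraction lemmas after the fact. The sketch below is not spaCy functionality: the function names and both lookup tables are my own assumptions about which forms matter, and they would need extending for real data. It works on plain (orth, lemma) tuples, which could be built from a parsed document with `[(t.orth_, t.lemma_) for t in doc]`.

```python
# Assumption: these token forms are the ones that should carry the lemma "not".
NEGATION_FORMS = {"n't", "n’t"}

# Assumption: stems that only get rewritten when a negation form follows them,
# so that an unrelated token spelled "ca" or "wo" is left alone.
STEM_LEMMAS = {"ca": "can", "wo": "will"}


def normalize_lemmas(pairs):
    """Return a copy of the (orth, lemma) pairs with contraction lemmas normalised."""
    pairs = list(pairs)
    normalized = []
    for i, (orth, lemma) in enumerate(pairs):
        lower = orth.lower()
        next_lower = pairs[i + 1][0].lower() if i + 1 < len(pairs) else ""
        if lower in NEGATION_FORMS:
            lemma = "not"
        elif lower in STEM_LEMMAS and next_lower in NEGATION_FORMS:
            lemma = STEM_LEMMAS[lower]
        normalized.append((orth, lemma))
    return normalized


def has_negation(pairs):
    """True if any token ends up with the lemma 'not' after normalisation."""
    return any(lemma == "not" for _, lemma in normalize_lemmas(pairs))
```

Checking the following token as well as the current one keeps the rewrite narrow, at the cost of missing forms that are not in the tables.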