Closed puli83 closed 6 years ago
It's possible that these problems are caused by the lookup tables – for some languages, spaCy currently ships with more complex, rule-based lemmatizers. All other languages are currently covered by lookup tables, which aren't as reliable, and sometimes also contain mistakes. You can find a few similar discussions under the feat/lemmatizer tag.
If you've come across a mistake that can be fixed by updating the lookup table, you can always submit a PR to fr/lemmatizer.py. We'd also love to transition all lemmatizers over to a rule-based approach in the future (like English).
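To illustrate why a single bad entry in a lookup table can produce a lemma like "poster", here is a minimal sketch of lookup-based lemmatization. It is not spaCy's implementation: `TOY_LOOKUP`, `lookup_lemma` and `fix_entry` are hypothetical names, and the table contents are made up for illustration, including the deliberately wrong entry for "présenter".

```python
from typing import Dict

# Hypothetical toy table, NOT spaCy's real French data.
TOY_LOOKUP: Dict[str, str] = {
    "veut": "vouloir",
    "candidats": "candidat",
    "prochaines": "prochain",
    "élections": "élection",
    "présenter": "poster",  # deliberately wrong entry, mimicking the report
}


def lookup_lemma(token: str, table: Dict[str, str]) -> str:
    """Return the table entry for the lowercased token, or the token itself."""
    # A pure lookup ignores the part-of-speech tag and the context,
    # so a wrong entry is returned no matter how the token was tagged.
    return table.get(token.lower(), token)


def fix_entry(table: Dict[str, str], form: str, lemma: str) -> Dict[str, str]:
    """Return a copy of the table with one entry corrected."""
    fixed = dict(table)
    fixed[form.lower()] = lemma
    return fixed


wrong = lookup_lemma("présenter", TOY_LOOKUP)
corrected_table = fix_entry(TOY_LOOKUP, "présenter", "présenter")
right = lookup_lemma("présenter", corrected_table)
print(wrong, right)
```

In this sketch, correcting the entry is the whole fix, which is roughly what a PR against the lookup table would amount to. A rule-based lemmatizer would instead use the part-of-speech tag, so an infinitive tagged as VERB could be left unchanged.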
Merging this with #2668!
Hi,
The problem is simple to explain. I lemmatize this sentence with spaCy: "Logiquement, l' ASSÉ veut présenter les candidats aux prochaines élections."
The ROOT of this sentence is "présenter", which is a VERB in its infinitive form. Why do I get "poster" as the lemma? This is strange. TreeTagger lemmatizes this sentence correctly.
Do you have any idea what is happening?
Your Environment