lmorillas closed this issue 4 years ago
Hello @lmorillas,
Thanks for writing.
The lemmatizer relies on a correctly inferred part-of-speech tag to extract the right lemma. In your first example, the word "compro" is tagged as VERB, so it is lemmatized correctly. In the second and third examples, however, it is tagged as NOUN and PROPN (proper noun), so the lemmatizer cannot extract the corresponding lemma. The same happens with the words "manzanas" and "peras", which are tagged as ADJ (adjective) even though both are nouns. Had they been tagged as NOUN, they would have been lemmatized correctly as "manzana" and "pera", respectively.
Implementing an accurate tagger is outside the scope of this package.
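To make the dependency on the tag concrete, here is a toy sketch in Python. It is not this package's actual code or API: the `TOY_LEMMAS` table and the `toy_lemmatize` function are hypothetical names invented for illustration, and the table holds only the three words discussed above. It assumes a lemmatizer that looks up the pair (word, tag) and falls back to the unchanged word when that pair is missing.

```python
# Toy illustration only: TOY_LEMMAS and toy_lemmatize are hypothetical,
# not the package's real data or API.
TOY_LEMMAS = {
    ("compro", "VERB"): "comprar",
    ("manzanas", "NOUN"): "manzana",
    ("peras", "NOUN"): "pera",
}


def toy_lemmatize(word, pos):
    """Return the lemma stored for (word, pos), or the word itself if none exists."""
    return TOY_LEMMAS.get((word.lower(), pos), word)


# Correct tag: the lookup succeeds.
print(toy_lemmatize("manzanas", "NOUN"))  # manzana
# Wrong tag from the tagger: no entry matches, so the word comes back unchanged.
print(toy_lemmatize("manzanas", "ADJ"))   # manzanas
```

In this sketch the lemmatizer itself is behaving as designed in both calls; the second result is wrong only because the tag passed in was wrong.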
I'm open to suggestions.
It works with a PRON + VERB structure:
But not with a VERB + ADJ structure:
It fails with a more complex structure too:
Have you tested it?