pablodms / spacy-spanish-lemmatizer

Spanish rule-based lemmatization for spaCy
MIT License

use spaCy lemma as fallback instead of token.text #2

Closed Langbraue closed 4 years ago

Langbraue commented 4 years ago

Hi, thank you for the repo! This change uses spaCy's original lemma as the fallback instead of the plain token.text, which gives better results in my experience. It also runs the lookup before the verb rules, which improves the results for irregular verbs.
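A minimal sketch of the ordering described above, not the repository's actual code. The names `LOOKUP`, `VERB_RULES`, and `lemmatize`, as well as the table entries and suffix rules, are hypothetical and only illustrate the idea: lookup first, then verb rules, then spaCy's own lemma as the fallback.

```python
# Hypothetical data for illustration; not the repository's real tables.
LOOKUP = {"fui": "ir", "tengo": "tener", "hecho": "hacer"}

# Toy suffix rules for regular verbs: (suffix, replacement).
VERB_RULES = [("ando", "ar"), ("iendo", "er"), ("aron", "ar")]


def lemmatize(text, pos, spacy_lemma):
    """Return a lemma: lookup table first, then verb rules,
    then the lemma spaCy had already assigned."""
    word = text.lower()

    # 1. Lookup first, so irregular forms never reach the suffix rules.
    if word in LOOKUP:
        return LOOKUP[word]

    # 2. Rule-based handling for regular verbs.
    if pos == "VERB":
        for suffix, replacement in VERB_RULES:
            if word.endswith(suffix):
                return word[: -len(suffix)] + replacement

    # 3. Fallback: spaCy's lemma rather than the raw token text.
    return spacy_lemma


print(lemmatize("fui", "VERB", "fui"))
print(lemmatize("cantando", "VERB", "cantando"))
print(lemmatize("casas", "NOUN", "casa"))
```

In a real pipeline the `spacy_lemma` argument would presumably come from `token.lemma_`, while `text` and `pos` would come from `token.text` and `token.pos_`.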

pablodms commented 4 years ago

Hello @Langbraue and @spATLASti

I think your proposal is a great idea and is smartly implemented :) Regarding irregular verbs, I think it would be interesting to let end users extend the default Wiktionary-based lookup table from the command line, so they can test and propose additional data sources.
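One way the table extension could look, as a rough sketch only. No such command or function exists in the project as far as this thread shows; `extend_lookup` and the JSON format (inflected form mapped to lemma) are assumptions made for illustration.

```python
import json


def extend_lookup(default_lookup, extra_json):
    """Return a new lookup table in which user-supplied entries
    are added to, and take precedence over, the default ones."""
    extra = json.loads(extra_json)
    merged = dict(default_lookup)  # copy, so the default table is untouched
    merged.update({form.lower(): lemma for form, lemma in extra.items()})
    return merged


# Hypothetical default table and user-supplied entries.
default_table = {"fui": "ir"}
user_entries = '{"Quepo": "caber"}'

table = extend_lookup(default_table, user_entries)
print(table["quepo"])
```

Returning a copy keeps the default table intact, which makes it easy to compare results with and without the extra data source.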

Unfortunately, I am currently involved in other projects and do not have time to maintain this one adequately.

Thank you very much for your collaboration.

Langbraue commented 4 years ago

Thanks a lot!

Sebastian Petrausch

On 11.04.2020 at 11:43, pablodms notifications@github.com wrote:

 Merged #2 into master.
