Open wadid opened 11 months ago
Hi @wadid, this is still in the works. For context, I will be using spaCy's neural edit-tree lemmatizer for this. I am not sure what my timeline will be, perhaps late December. If you're in a rush, I suggest training your own lemmatizer for now.
Another option is to lemmatize with a rules-based approach. However, that might require more research into the exact lemmatization rules for Tagalog.
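To give a rough idea of what a rules-based approach could look like, here is a minimal sketch in plain Python. The affix lists and the `strip_affixes` helper are hypothetical and only for illustration; they are not taken from calamanCy, spaCy, or any existing stemmer, and they cover only a tiny slice of Tagalog morphology.

```python
# Toy affix rules for illustration only (assumption): not a complete or
# validated description of Tagalog morphology.
PREFIXES = ("nag", "mag")  # e.g. nag + laro -> naglaro
INFIXES = ("um", "in")     # inserted after the first consonant, e.g. k-um-ain
VOWELS = "aeiou"


def strip_affixes(word: str) -> str:
    """Return a guessed root by removing at most one known affix."""
    w = word.lower()

    # Rule 1: drop a known prefix if at least three letters remain.
    for prefix in PREFIXES:
        if w.startswith(prefix) and len(w) - len(prefix) >= 3:
            return w[len(prefix):]

    # Rule 2: drop an infix that sits right after an initial consonant.
    for infix in INFIXES:
        end = 1 + len(infix)
        if len(w) > len(infix) + 3 and w[0] not in VOWELS and w[1:end] == infix:
            return w[0] + w[end:]

    # No rule matched: return the word unchanged.
    return w


for token in ["naglaro", "kumain", "binili", "bahay"]:
    print(token, "->", strip_affixes(token))
```

Without a dictionary of valid roots, rules like these will also mangle words that merely happen to start with the same letters, which is part of why the real rules need more research.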
Do you know this project? https://github.com/crlwingen/TagalogStemmerPython Accuracy rate of 94.12%. How good is that?
Hi, thanks for this. I think 94.12% accuracy is decent, since Tagalog lemmatization rules can get complicated due to the agglutinative nature of the language. Right now, I'm trying to port both approaches into calamanCy: a rules-based one using that stemmer and a neural one using spaCy's edit-tree lemmatizer.
Hi, is there something like a lemmatizer? I have a couple of Tagalog sentences with translations and I am trying to lemmatize them (then sort the lemmas by frequency and use the result myself for language learning ;))