bjascob / LemmInflect

A python module for English lemmatization and inflection.
MIT License

Lemma model can select rule for wrong pos type #1

Open bjascob opened 5 years ago

bjascob commented 5 years ago

For the test cases 'quilting/NOUN' and 'plastering/NOUN', the words are not in the lemma lookup, so the OOV rules are called.

getAllLemmasOOV('quilting', 'NOUN') returns 'quilt' (it selects rule "ing,,False")
getAllLemmasOOV('plastering', 'NOUN') returns 'plastering' (it selects rule ",,False")

In the case of 'quilting', the model selects a verb rule even though the requested POS is NOUN. To prevent this consider...

In addition, the model classes include the ending letters to remove. However, similar to the above, nothing prevents the model from selecting a "remove ing" rule for a word ending in something else. I'm not aware of this causing issues, but it should be investigated when looking into the first issue.
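
One possible guard for the second point, as a rough sketch only and not the library's actual code: check that the word really ends with the letters a rule wants to remove before applying it, and fall through to the next candidate otherwise. The names `Rule`, `parse_rule`, `apply_rule` and `first_applicable` are hypothetical. The sketch assumes the rule strings quoted above have the form "remove,add,flag"; the meaning of the third field is not stated here, so it is parsed but ignored.

```python
from typing import NamedTuple, Optional, Sequence


class Rule(NamedTuple):
    remove: str  # ending letters to strip, e.g. "ing" (assumed field order)
    add: str     # ending letters to append after stripping
    flag: bool   # third field of the rule string; meaning unknown, unused here


def parse_rule(text: str) -> Rule:
    """Parse a rule string such as "ing,,False" into its three fields."""
    remove, add, flag = text.split(",")
    return Rule(remove, add, flag == "True")


def apply_rule(word: str, rule: Rule) -> Optional[str]:
    """Return the candidate lemma, or None if the rule does not fit the word."""
    if not word.endswith(rule.remove):
        return None  # e.g. a "remove ing" rule offered for a word ending in "ed"
    stem = word[: len(word) - len(rule.remove)]
    return stem + rule.add


def first_applicable(word: str, ranked_rules: Sequence[str]) -> str:
    """Walk rules in ranked order and use the first one that fits the word."""
    for text in ranked_rules:
        lemma = apply_rule(word, parse_rule(text))
        if lemma is not None:
            return lemma
    return word  # no rule fits: leave the word unchanged


print(first_applicable("quilted", ["ing,,False", "ed,,False"]))
```

This only covers the ending mismatch. It does not address the first issue, since "ing,,False" still fits 'quilting' when the requested POS is NOUN.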