bjascob / LemmInflect

A Python module for English lemmatization and inflection.
MIT License

Incorrect base form for "install" #7

Closed bgoldowsky closed 3 years ago

bgoldowsky commented 3 years ago

Is it appropriate to report single words that are incorrect as bugs? I realize the dictionary can't be 100% complete.

LemmInflect 0.2.1: getAllLemmas('install') returns {'VERB': ('install', 'instal')}

I think it should just return 'install'.

bjascob commented 3 years ago

Interesting. Per Google, "instal" is a chiefly British variant of "install", and Merriam-Webster lists it as well. Both spellings mean the same thing and are pronounced the same way.

Even the "wamerican" dictionary I used has both spellings, even though I've never seen this spelled with one "l".

In cases where more than one spelling is returned in the tuple, the first should be the most common. Part of developing the system involved counting word occurrences in a corpus so that, when multiple spellings exist, they can be ordered by how commonly they occur.
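To illustrate the ordering idea described above, here is a minimal sketch. It is not LemmInflect's actual implementation. The function names `order_spellings` and `preferred_lemma` and the toy corpus are hypothetical, made up for this example. The only thing taken from the thread is the shape of the returned dictionary.

```python
from collections import Counter


def order_spellings(spellings, corpus_tokens):
    """Sort candidate spellings so the most frequent one in the corpus comes first."""
    counts = Counter(corpus_tokens)
    # sorted() is stable, so spellings with equal counts keep their input order.
    return tuple(sorted(spellings, key=lambda s: -counts[s]))


def preferred_lemma(all_lemmas, upos):
    """Return the first (assumed most common) spelling for a part of speech, or None."""
    forms = all_lemmas.get(upos, ())
    return forms[0] if forms else None


# Toy corpus, invented for illustration only (assumption, not real corpus data).
toy_corpus = (
    "please install the update then install the driver or instal it later".split()
)

ordered = order_spellings(("instal", "install"), toy_corpus)

# Same shape as the getAllLemmas('install') result quoted in this issue.
result = {"VERB": ordered}

print(result)
print(preferred_lemma(result, "VERB"))
```

A caller who only wants the common spelling can therefore take the first element of the tuple rather than treating the second entry as an error.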

bgoldowsky commented 3 years ago

Ah, that makes sense. It's legit then; I had just never seen that spelling before. Thank you for the explanation!

bjascob commented 3 years ago

Closing: this looked like an error, but the alternate spelling is apparently acceptable.