Open tim5go opened 5 years ago
Interesting. Those are indeed problematic. I've generally been happy with the vast majority of "singularized" words, but I'll add a few that were problems for me:
cross->cros, goddess->goddes, sadness->sadnes, sarcophagus->sarcophagu, putti->puttus (should be putto), world war ii->world war ius
Also adding the errors as described above:
business->business, virginia->virginium, tour->tmy, loss->los
Hi, I'm a new contributor to this repo and I'd like to try my hand at solving this.
The documentation states that pattern.text.en.inflect's singularize function, which seems to be the problem here, has been adapted from this repo: https://github.com/bermi/Python-Inflector. Was it taken directly, or were changes made? I'm wondering whether I should wander over to the inflector to see how the singularize algorithm works, or just look at the one here.
Also, I get viruses->viruse. Is there an updated model file or something? Is it solved in 3.6?
The built-in singularize function yields lots of false positives. Here are some examples: 1) business 2) virginia 3) tour 4) loss
It turns out I need to define and maintain my own exception dictionary, which is really inconvenient. I know it's hard to cover all cases, but some of the false positives are really trivial. I am quite disappointed, given that this repo has received lots of stars.
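For anyone who needs a stopgap, here is a minimal sketch of the exception-dictionary workaround mentioned above. It is not part of the library: `EXCEPTIONS`, `naive_singularize`, and `make_safe_singularize` are hypothetical names chosen for illustration, and `naive_singularize` is only a toy stand-in for a rule-based singularizer so the snippet runs without any dependencies.

```python
from typing import Callable, Dict, Optional

# Hypothetical exception table built from the words reported in this thread.
# Keys are lowercase inputs; values are the singular forms we want returned.
EXCEPTIONS: Dict[str, str] = {
    "business": "business",
    "virginia": "virginia",
    "tour": "tour",
    "loss": "loss",
    "cross": "cross",
    "goddess": "goddess",
    "sadness": "sadness",
    "sarcophagus": "sarcophagus",
    "putti": "putto",
    "viruses": "virus",
}


def naive_singularize(word: str) -> str:
    """Toy stand-in for a rule-based singularizer: strips one trailing 's'."""
    return word[:-1] if word.endswith("s") else word


def make_safe_singularize(
    base: Callable[[str], str],
    exceptions: Optional[Dict[str, str]] = None,
) -> Callable[[str], str]:
    """Wrap `base` so that words in the exception table bypass its rules."""
    source = EXCEPTIONS if exceptions is None else exceptions
    table = {key.lower(): value for key, value in source.items()}

    def safe(word: str) -> str:
        # Look the word up case-insensitively; fall back to the base rules.
        hit = table.get(word.lower())
        return hit if hit is not None else base(word)

    return safe


safe_singularize = make_safe_singularize(naive_singularize)

for w in ("loss", "putti", "viruses", "cats"):
    print(w, "->", safe_singularize(w))
```

With the library installed, the same wrapper could presumably be given the real function instead of the stand-in, e.g. `make_safe_singularize(singularize)` after `from pattern.en import singularize`. The table still has to be maintained by hand, so this only hides the problem rather than fixing the underlying rules.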