Closed jar398 closed 1 year ago
Thanks for spotting this, @jar398. You are right, both i's should be removed by stemming.
Hm, actually, I have some doubts now. I did ask zoologists, but I think I also need to ask botanists.
We used an example list from https://snowballstem.org/otherapps/schinke/, where names like 'aduersarii' are stemmed to 'aduersari'. I am not sure that is right; we probably need to find an alternative algorithm, or make one based on a Latin linguist's advice.
Both i's of -ii should be deleted, according to zoological and botanical experts. I will change the stemming algorithm accordingly.
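To make the intended behaviour concrete, here is a minimal sketch of stripping a trailing -ii or -i so that both spellings reduce to the same stem. It is not the Schinke algorithm and not this project's actual implementation; the function names `strip_genitive_i` and `same_epithet` and the two-character minimum stem length are assumptions made up for illustration.

```python
def strip_genitive_i(epithet: str) -> str:
    """Hypothetical helper: drop a trailing -ii or -i from an epithet."""
    word = epithet.lower()
    # Try the longer suffix first so that 'bairdii' loses both i's.
    for suffix in ("ii", "i"):
        # Assumption: keep at least two characters of stem.
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    return word


def same_epithet(a: str, b: str) -> bool:
    """Treat two epithets as equivalent if their stripped forms match."""
    return strip_genitive_i(a) == strip_genitive_i(b)


if __name__ == "__main__":
    print(strip_genitive_i("bairdi"), strip_genitive_i("bairdii"))
    print(same_epithet("bairdi", "bairdii"))
```

With this rule, `bairdi` and `bairdii` both reduce to `baird`, and `aduersarii` reduces to `aduersar` rather than `aduersari`. A real stemmer would also have to handle the other Latin endings, which this sketch ignores.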
Thanks!
Maybe this is intentional, but I ran into this problem and thought I'd ask. I see the following two stemming results:
I would naively expect the entire -ii suffix to be removed when stemming, so that these two epithets can be seen as equivalent.
Another: Sorex bairdi / bairdii. Examples are from MSW3 vs. MDD.
I am running v1.5.2; apologies if this has been fixed already.
Thanks