GlobalNamesArchitecture / gnindex

MIT License

As a user, I want partial match results to also include matches on the genus and the lowest infraspecies. #67

Open dimus opened 5 years ago

dimus commented 5 years ago

@dimus commented on Fri Oct 05 2018

Currently we cut a canonical binomial, trinomial, or tetranomial word by word until we find a match.

In addition, it would be useful to return results when we find a match for the combination of the first word of a multinomial name (the genus) and the very last epithet in the canonical form. Such an addition seems to be beneficial for both zoologists and botanists.
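
To make the proposed query concrete, here is a minimal sketch in Python. The function name `genus_and_last_epithet` is hypothetical and not part of gnindex, and returning `None` for names shorter than three words is an assumption (for a binomial, genus plus last epithet is just the full name).

```python
from typing import Optional


def genus_and_last_epithet(canonical: str) -> Optional[str]:
    """Build the 'genus + lowest epithet' query from a canonical name.

    Hypothetical helper for illustration only; not taken from gnindex.
    """
    words = canonical.split()
    # Assumption: only trinomials and longer have a "middle" to drop.
    if len(words) < 3:
        return None
    return f"{words[0]} {words[-1]}"
```

For a placeholder tetranomial such as `"Aus bus cus dus"`, this yields the query `"Aus dus"`.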

alexander-myltsev commented 5 years ago

@dimus, please verify the algorithm:

dimus commented 5 years ago

If there is only a fuzzy match for the genus, we return no match.

dimus commented 5 years ago

If there are fuzzy matches for 2 or more words, we return them, along with fuzzy matches for the genus and the lowest infraspecies, if any.

If we find a fuzzy match, we return it and do not go to partial match at all.

dimus commented 5 years ago

The algorithm looks like this to me.

  1. If we found any other match, we do not do partial match at all.
  2. For partial match we remove a word and try to match (exact and fuzzy) the rest. If that does not work, we remove one more word and try to match (exact and fuzzy) the rest. If only one word is left, we only do an exact match.
  3. In addition, we remove everything in the middle and try to match the genus and the last word (exact and fuzzy). If we get a result, we return it together with the result from 2 (if any).
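
The three steps above can be sketched roughly as follows. This is only an illustration in Python, not the gnindex implementation: `resolve`, `partial_match`, and the `exact`/`fuzzy` matcher callables are hypothetical names. It also assumes that words are removed from the end of the name and that the fuzzy matcher is tried only when the exact matcher finds nothing.

```python
from typing import Callable, List

# Hypothetical matcher type: takes a query string, returns matched names.
Matcher = Callable[[str], List[str]]


def partial_match(canonical: str, exact: Matcher, fuzzy: Matcher) -> List[str]:
    words = canonical.split()
    results: List[str] = []

    # Step 2: drop one word at a time (assumption: from the end).
    for n in range(len(words) - 1, 0, -1):
        query = " ".join(words[:n])
        found = exact(query)
        # A single remaining word (the genus) is matched exactly only.
        if not found and n > 1:
            found = fuzzy(query)
        if found:
            results.extend(found)
            break

    # Step 3: drop the middle, match genus + last word (exact, then fuzzy).
    if len(words) >= 3:
        query = f"{words[0]} {words[-1]}"
        found = exact(query) or fuzzy(query)
        results.extend(r for r in found if r not in results)

    return results


def resolve(canonical: str, exact: Matcher, fuzzy: Matcher) -> List[str]:
    # Step 1: any match on the full name means no partial match at all.
    full = exact(canonical) or fuzzy(canonical)
    if full:
        return full
    return partial_match(canonical, exact, fuzzy)
```

With stub matchers that know only `"Aus bus"` and `"Aus dus"`, resolving `"Aus bus cus dus"` would return both: the first from step 2 and the second from step 3.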