gnames / gnfinder

GNfinder finds scientific names in UTF8 texts, PDF files, MS Word/Excel documents, URLs etc.
MIT License

As a developer I would like `sp.nov.` (written without a space) to not return NO_ANNOT #140

Closed mjy closed 1 year ago

mjy commented 1 year ago

Probably an issue with token handling requiring spaces. If we can't handle it here, I'll pre-process the text.

3.2.1 :022 > t = 'Turripria woldai sp.nov'
 => "Turripria woldai sp.nov" 
3.2.1 :024 > r = Vendor::Gnfinder.result(t, project_id: 13)
 => 
#<Vendor::Gnfinder::Result:0x0000000117aedae0
... 
3.2.1 :025 > r.names.first.found.annotation_nomen_type
 => "NO_ANNOT" 
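
The pre-processing fallback mentioned above could look roughly like the sketch below. It is not part of GNfinder or its Ruby client: `normalize_nomen_annotation` and `ANNOTATION_WITHOUT_SPACE` are hypothetical names, and the list of abbreviations is only illustrative. It inserts the missing space in run-together annotations such as `sp.nov`, on the assumption that the tokenizer needs that space.

```ruby
# Hypothetical helper, not part of GNfinder or any client library.
# Matches an abbreviation such as "sp." that is immediately followed by
# "nov" with no space in between. The abbreviation list is illustrative.
ANNOTATION_WITHOUT_SPACE = /\b(sp|ssp|gen|comb)\.(?=nov\b)/i

# Returns a copy of +text+ with a space inserted after the abbreviation,
# e.g. "sp.nov" becomes "sp. nov". Text that already has the space is
# left unchanged.
def normalize_nomen_annotation(text)
  text.gsub(ANNOTATION_WITHOUT_SPACE) { "#{Regexp.last_match(1)}. " }
end
```

The cleaned string would then be passed to the same `Vendor::Gnfinder.result` call as above. Whether that is enough to change the `NO_ANNOT` result has not been checked here.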
dimus commented 1 year ago

Makes sense, I'll think about how to deal with this.

mjy commented 1 year ago

Other combinations that do not seem to be detected:

mjy commented 1 year ago

Other combinations that succeed, somewhat unexpectedly:

dimus commented 1 year ago

Other combinations that do not seem to be detected:

  • Coccidencyrtus pomadus sp. nov..\n (could be wrong on this)

I think this is part of #143