Open Archilegt opened 2 years ago
Maybe "petiolis inermibus" or a spelling variant is producing the false positive.
I think the related output from gnfinder is this one:
{
"cardinality": 2,
"verbatim": "Petrolus inermis,",
"name": "Petrolus inermis",
"oddsLog10": 11.983664170973137,
"oddsDetails": [
{
"feature": "spDict: inSpecies",
"odds": 8904.045433955427
},
{
"feature": "uniDict: inGenus",
"odds": 2976.794090112943
},
{
"feature": "uniEnd3: lus",
"odds": 570.6314549737272
},
{
"feature": "spEnd3: mis",
"odds": 210.6946910672223
},
{
"feature": "spLen: 7",
"odds": 3.6025724692203513
},
{
"feature": "uniLen: 8",
"odds": 0.9606164921956841
},
{
"feature": "abbr: false",
"odds": 0.8732848865715452
},
{
"feature": "priorOdds: true",
"odds": 0.1
}
],
"start": 143,
"end": 160,
"annotationNomenType": "NO_ANNOT",
"verification": {
"id": "0dbc49e2-b393-5d52-a0be-2b09ce6231fa",
"name": "Petrolus inermis",
"cardinality": 2,
"matchType": "PartialExact",
"bestResult": {
"dataSourceId": 181,
"dataSourceTitleShort": "IRMNG",
"curation": "Curated",
"recordId": "urn:lsid:irmng.org:taxname:1391559",
"entryDate": "2022-06-10",
"sortScore": 8.67908829458864,
"matchedName": "Petrolus Rafinesque, 1815",
"matchedCardinality": 1,
"matchedCanonicalSimple": "Petrolus",
"matchedCanonicalFull": "Petrolus",
"currentRecordId": "urn:lsid:irmng.org:taxname:1391559",
"currentName": "Petrolus Rafinesque, 1815",
"currentCardinality": 1,
"currentCanonicalSimple": "Petrolus",
"currentCanonicalFull": "Petrolus",
"isSynonym": false,
"classificationPath": "Biota|Animalia|Chordata|Vertebrata|Reptilia|Reptilia|Reptilia|Petrolus",
"classificationRanks": "|Kingdom|Phylum|Subphylum|Class|Order|Family|Genus",
"classificationIds": "urn:lsid:irmng.org:taxname:1|urn:lsid:irmng.org:taxname:2|urn:lsid:irmng.org:taxname:148|urn:lsid:irmng.org:taxname:11905117|urn:lsid:irmng.org:taxname:1448|urn:lsid:irmng.org:taxname:10544|urn:lsid:irmng.org:taxname:100138|urn:lsid:irmng.org:taxname:1391559",
"editDistance": 0,
"stemEditDistance": 0,
"matchType": "PartialExact",
"scoreDetails": {
"cardinalityScore": 0,
"infraSpecificRankScore": 0,
"fuzzyLessScore": 1,
"curatedDataScore": 0.6666667,
"authorMatchScore": 0.14285715,
"acceptedNameScore": 1,
"parsingQualityScore": 1
}
},
So it looks like Pithopus inermis is not returned by gnfinder.
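As a side note on reading the output above: the `oddsLog10` value appears to be the base-10 logarithm of the product of the individual `odds` entries in `oddsDetails`. The sketch below checks that against the numbers pasted above. The function name `odds_log10` is made up for illustration and is not part of gnfinder.

```python
import math

# Odds copied from the "oddsDetails" entries in the gnfinder output above.
ODDS_DETAILS = [
    {"feature": "spDict: inSpecies", "odds": 8904.045433955427},
    {"feature": "uniDict: inGenus", "odds": 2976.794090112943},
    {"feature": "uniEnd3: lus", "odds": 570.6314549737272},
    {"feature": "spEnd3: mis", "odds": 210.6946910672223},
    {"feature": "spLen: 7", "odds": 3.6025724692203513},
    {"feature": "uniLen: 8", "odds": 0.9606164921956841},
    {"feature": "abbr: false", "odds": 0.8732848865715452},
    {"feature": "priorOdds: true", "odds": 0.1},
]

REPORTED_ODDS_LOG10 = 11.983664170973137  # "oddsLog10" from the same output


def odds_log10(details):
    """Sum log10 of each feature's odds (equal to log10 of their product)."""
    return sum(math.log10(d["odds"]) for d in details)


if __name__ == "__main__":
    computed = odds_log10(ODDS_DETAILS)
    print(f"computed: {computed:.6f}  reported: {REPORTED_ODDS_LOG10:.6f}")
```

Read this way, the two dictionary features (`spDict: inSpecies` and `uniDict: inGenus`) contribute most of the score, which would explain why the OCR string "Petrolus inermis" is accepted so confidently.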
@mlichtenberg and @cajunjoel, can you help find out how this false positive appeared in BHL?
It was old data left over from a previous name-finding algorithm. I re-ran that page through the latest version of GNFinder (1.0.0) and the data now reflects the GNFinder output shown in the previous comment (https://www.biodiversitylibrary.org/page/663902).
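For anyone who wants to repeat this kind of check on other pages, here is a minimal sketch. It assumes the gnfinder JSON output has already been loaded into a list of name records shaped like the one quoted earlier in this thread (with `name`, `cardinality`, and `verification.bestResult.matchedCardinality`). The helper names `find_name` and `partial_matches` are hypothetical, and the sample record only reuses values from that output.

```python
# Sample record reduced to the fields used below; values come from the
# gnfinder output quoted in this thread.
SAMPLE_NAMES = [
    {
        "name": "Petrolus inermis",
        "cardinality": 2,
        "verification": {
            "matchType": "PartialExact",
            "bestResult": {
                "matchedName": "Petrolus Rafinesque, 1815",
                "matchedCardinality": 1,
            },
        },
    },
]


def find_name(names, target):
    """Return the records whose detected name equals `target`, ignoring case."""
    wanted = target.casefold()
    return [n for n in names if n.get("name", "").casefold() == wanted]


def partial_matches(names):
    """Return records verified against fewer name parts than were detected,
    e.g. a binomial that matched only at genus level."""
    flagged = []
    for n in names:
        best = n.get("verification", {}).get("bestResult")
        if best and best.get("matchedCardinality", 0) < n.get("cardinality", 0):
            flagged.append(n)
    return flagged


if __name__ == "__main__":
    print(find_name(SAMPLE_NAMES, "Pithopus inermis"))  # no such record here
    for n in partial_matches(SAMPLE_NAMES):
        print(n["name"], "->", n["verification"]["bestResult"]["matchedName"])
```

The second helper is there because a binomial verified only at genus level, like "Petrolus inermis" matching "Petrolus", is often a sign of an OCR error rather than a real name.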
@mlichtenberg, @cajunjoel, given that bhlindex v1.0.0 is coming soon, maybe we should plan to run it against the whole of BHL in October and get rid of the outdated inaccuracies left by the old algorithms?
Recognition of Petrolus is as expected for the "Petiolus inermis" sentence in line 5, given the underlying uncorrected OCR "Petrolus inermis". There is one less false positive for a centipede name! ;) I will leave the issue open in case you wish to continue working on it.
Document false positive Pithopus inermis on page https://www.biodiversitylibrary.org/page/663902
The name does not occur on that page. If we figure out what went wrong, maybe we could fix it.