Open djtfmartin opened 5 months ago
For now, I've changed to only use inferred ranks from the parsed name for binomials and trinomials. This has fixed a number of matches (over 200) in the test set.
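A minimal sketch of that rule, assuming the name structure can be approximated by counting whitespace-separated tokens in the canonical name. The function name `rank_for_matching` is hypothetical and the real name parser is far more thorough; this only illustrates the idea of ignoring the inferred rank for uninomials.

```python
from typing import Optional


def rank_for_matching(canonical_name: str, inferred_rank: Optional[str]) -> Optional[str]:
    """Return the inferred rank only for binomials and trinomials.

    Hypothetical helper: token counting stands in for the parser's own
    notion of a binomial or trinomial.
    """
    token_count = len(canonical_name.split())
    if token_count in (2, 3):
        return inferred_rank
    # Uninomials (and anything longer) do not get an inferred rank.
    return None


print(rank_for_matching("Holothuroidea", "SUPERFAMILY"))  # None
print(rank_for_matching("Holothuria atra", "SPECIES"))    # SPECIES
```

With this rule, a uninomial such as Holothuroidea no longer carries an inferred superfamily rank into the rank comparison.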
For the example in the issue, the response now looks like this:
"usageKey": 222,
"scientificName": "Holothuroidea",
"canonicalName": "Holothuroidea",
"rank": "CLASS",
"status": "ACCEPTED",
"confidence": 94,
"note": "Similarity: name=100; authorship=0; classification=-2; rank=0; status=1; score=99; nextMatch=5"
Many of the changes in rankSimilarity arise because the GBIF nameparser Rank enum has more entries (116 vs 75). The enum ordinal is used to calculate the difference between ranks, so the larger enum changes the `rankSimilarity` scores.
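To illustrate why the enum size matters, here is a rough sketch using list positions as a stand-in for Java enum ordinals. Both rank lists are hypothetical and heavily abbreviated (they are not the real 75- or 116-entry enums), and `ordinal_distance` is an illustrative helper, not the actual `rankSimilarity` code.

```python
# Hypothetical, abbreviated rank lists; NOT the real GBIF enums.
SHORT_RANKS = [
    "KINGDOM", "PHYLUM", "CLASS", "ORDER",
    "SUPERFAMILY", "FAMILY", "GENUS", "SPECIES",
]
LONG_RANKS = [
    "KINGDOM", "SUBKINGDOM", "PHYLUM", "SUBPHYLUM", "CLASS",
    "SUBCLASS", "INFRACLASS", "ORDER", "SUBORDER", "INFRAORDER",
    "SUPERFAMILY", "FAMILY", "SUBFAMILY", "GENUS", "SPECIES",
]


def ordinal_distance(ranks, a, b):
    """Absolute difference of list positions, mimicking an ordinal difference."""
    return abs(ranks.index(a) - ranks.index(b))


print(ordinal_distance(SHORT_RANKS, "CLASS", "SUPERFAMILY"))  # 2
print(ordinal_distance(LONG_RANKS, "CLASS", "SUPERFAMILY"))   # 6
```

The same pair of ranks ends up further apart once intermediate ranks are added, so any score derived from the ordinal difference shifts even though the taxonomy has not changed.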
The rank inferred by the name parser is causing differences in the `rankSimilarity` measure used in creating the `confidence` numeric value in the ported API.

The net result is differences in the confidence value associated with matches, which in turn creates differences between the ported API (`matching-ws`) and the current GBIF API.

As an example, `Holothuroidea` is inferred to be a superfamily from the structure of the name.

Current GBIF API:

Ported API - which infers superfamily, doesn't match due to low confidence value of 65 (threshold=80)
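
When comparing the two APIs on a case like this, it can help to split the `note` string into its component scores and check the confidence against the threshold. The sketch below uses the `note` value from the response quoted in this thread. The helper names are hypothetical, and treating the threshold as inclusive is an assumption; the observation that the components add up to `score` holds for this one example only.

```python
MATCH_THRESHOLD = 80  # threshold quoted in this issue


def parse_note(note):
    """Turn 'Similarity: name=100; rank=0; ...' into a dict of ints."""
    _, _, body = note.partition(":")
    scores = {}
    for item in body.split(";"):
        key, _, value = item.strip().partition("=")
        scores[key] = int(value)
    return scores


def is_match(confidence, threshold=MATCH_THRESHOLD):
    # Assumption: a confidence equal to the threshold counts as a match.
    return confidence >= threshold


NOTE = ("Similarity: name=100; authorship=0; classification=-2; "
        "rank=0; status=1; score=99; nextMatch=5")
scores = parse_note(NOTE)
components = ("name", "authorship", "classification", "rank", "status")

print(sum(scores[k] for k in components) == scores["score"])  # True
print(is_match(94), is_match(65))  # True False
```

Diffing the parsed dictionaries from both APIs would show directly whether the `rank` component is the one pulling the confidence below the threshold.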