timrobertson100 closed this 11 months ago
The lookup cache contains:
hbase(main):001:0> scan 'name_usage_kv', { FILTER => "RowFilter(=, 'substring:Lissotarsus reticulata')" }
ROW COLUMN+CELL
6|||||||||Lissotarsus reticulata Chaudoir, 1842||||| column=v:j, timestamp=1696041217943, value={"synonym":true,"usage":{"key":9355155,"name":"Lissotarsus reticulatus Chaudoir, 1842","rank":"SPECIES"},"acceptedUsage":{"ke
y":7811407,"name":"Platyderus reticulatus (Chaudoir, 1842)","rank":"SPECIES"},"classification":[{"key":1,"name":"Animalia","rank":"KINGDOM"},{"key":54,"name":"Arthropod
a","rank":"PHYLUM"},{"key":216,"name":"Insecta","rank":"CLASS"},{"key":1470,"name":"Coleoptera","rank":"ORDER"},{"key":3792,"name":"Carabidae","rank":"FAMILY"},{"key":3
260555,"name":"Platyderus","rank":"GENUS"},{"key":7811407,"name":"Platyderus reticulatus","rank":"SPECIES"}],"diagnostics":{"matchType":"FUZZY","confidence":99,"status"
:"SYNONYM","lineage":[],"alternatives":[]},"iucnRedListCategory":{"category":"NOT_EVALUATED","code":"NE","scientificName":"Lissotarsus reticulatus Chaudoir, 1842","taxo
nomicStatus":"SYNONYM","acceptedName":"Platyderus reticulatus (Chaudoir, 1842)"},"issues":[]}
1 row(s) in 45.7350 seconds
Formatted for readability:
The cell timestamp (1696041217943) corresponds to Saturday, September 30, 2023 2:33:37.943 AM UTC.
{
  "synonym": true,
  "usage": {
    "key": 9355155,
    "name": "Lissotarsus reticulatus Chaudoir, 1842",
    "rank": "SPECIES"
  },
  "acceptedUsage": {
    "key": 7811407,
    "name": "Platyderus reticulatus (Chaudoir, 1842)",
    "rank": "SPECIES"
  },
  "classification": [
    { "key": 1, "name": "Animalia", "rank": "KINGDOM" },
    { "key": 54, "name": "Arthropoda", "rank": "PHYLUM" },
    { "key": 216, "name": "Insecta", "rank": "CLASS" },
    { "key": 1470, "name": "Coleoptera", "rank": "ORDER" },
    { "key": 3792, "name": "Carabidae", "rank": "FAMILY" },
    { "key": 3260555, "name": "Platyderus", "rank": "GENUS" },
    { "key": 7811407, "name": "Platyderus reticulatus", "rank": "SPECIES" }
  ],
  "diagnostics": {
    "matchType": "FUZZY",
    "confidence": 99,
    "status": "SYNONYM",
    "lineage": [],
    "alternatives": []
  },
  "iucnRedListCategory": {
    "category": "NOT_EVALUATED",
    "code": "NE",
    "scientificName": "Lissotarsus reticulatus Chaudoir, 1842",
    "taxonomicStatus": "SYNONYM",
    "acceptedName": "Platyderus reticulatus (Chaudoir, 1842)"
  },
  "issues": []
}
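For anyone wanting to check the date given for the cached cell, here is a minimal sketch of the conversion. It assumes the HBase cell timestamp is the default milliseconds-since-epoch value; the class and method names are made up for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class CellTimestamp {

  // Assumption: the cell timestamp is milliseconds since the Unix epoch
  // (the HBase default when no explicit timestamp is supplied on write).
  static ZonedDateTime toUtc(long epochMillis) {
    return Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC);
  }

  public static void main(String[] args) {
    // Timestamp copied from the scan output above.
    ZonedDateTime cachedAt = toUtc(1696041217943L);
    System.out.println(DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(cachedAt));
    System.out.println(cachedAt.getDayOfWeek());
  }
}
```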
The lookup appears to have worked and been cached as expected, but the result wasn't included in the interpreted record. Reprocessing yields the same result.
With @muttcg's help we have diagnosed this, and it's behaving as intended, @mdoering.
It's dropping into this branch:
if (usageMatch == null || isEmpty(usageMatch) || checkFuzzy(usageMatch, identification)) {
  // "NO_MATCHING_RESULTS". This happens when we get an empty response from the WS
  addIssue(tr, TAXON_MATCH_NONE);
  tr.setUsage(INCERTAE_SEDIS);
  tr.setClassification(Collections.singletonList(INCERTAE_SEDIS));
}
The web service is returning a fuzzy match (reticulata vs. reticulatus). As we described in this issue, if there are no higher taxa on the record (there aren't in this case), we don't assume a fuzzy match is correct, because doing so made too many mistakes. This record needs a higher taxon added in order to match.
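To make the rule concrete, here is a rough sketch of the kind of check described above. It is not the actual `checkFuzzy` implementation: the class name, method signature, and the choice of which ranks count as "higher taxa" are assumptions for illustration only.

```java
public class FuzzyMatchGuard {

  // Hypothetical simplification: reject a match when the service reports it as
  // FUZZY and the record supplies no higher taxon at all to corroborate it.
  static boolean rejectFuzzy(String matchType, String... higherTaxa) {
    if (!"FUZZY".equals(matchType)) {
      return false; // exact and other match types are not affected by this rule
    }
    for (String taxon : higherTaxa) {
      if (taxon != null && !taxon.trim().isEmpty()) {
        return false; // at least one higher taxon backs up the fuzzy match
      }
    }
    return true; // fuzzy match with nothing to corroborate it
  }

  public static void main(String[] args) {
    // The record in this issue: fuzzy match, no kingdom/phylum/class/etc. supplied.
    System.out.println(rejectFuzzy("FUZZY", null, null, null));
    // The same record once a kingdom is available.
    System.out.println(rejectFuzzy("FUZZY", "Animalia", null, null));
  }
}
```

Under these assumptions the first call returns true (the record falls into the incertae sedis branch) and the second returns false.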
I don't think we want to change this behavior - agree?
As it happens, this is a narrowly scoped dataset (titled "Coleoptera...") so we could add a default of kingdom = Animalia in the registry which would at least improve this dataset.
Ah, that makes sense. It would be great to understand why that has happened from a user perspective, but yes we should keep it. And for sure add a default classification to the dataset. I see this is done already.
We could add more, but I'll start conservatively.
Animalia was enough for this example, but there were records being interpreted as Fungi as well, so I added Animalia / Arthropoda / Insecta, and that has put this into better shape.
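As an illustration of what a default classification does for records like this one, here is a small sketch. How the registry stores the defaults and how the pipeline applies them is not shown in this thread, so the map-based representation and the names below are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DefaultClassification {

  // Hypothetical helper: start from the dataset-level defaults and let any
  // non-blank value supplied on the record itself take precedence.
  static Map<String, String> withDefaults(Map<String, String> verbatim,
                                          Map<String, String> defaults) {
    Map<String, String> merged = new LinkedHashMap<>(defaults);
    verbatim.forEach((rank, name) -> {
      if (name != null && !name.trim().isEmpty()) {
        merged.put(rank, name);
      }
    });
    return merged;
  }

  public static void main(String[] args) {
    // Defaults as described above for this dataset.
    Map<String, String> defaults = new LinkedHashMap<>();
    defaults.put("kingdom", "Animalia");
    defaults.put("phylum", "Arthropoda");
    defaults.put("class", "Insecta");

    // The record in this issue carries only a scientific name.
    Map<String, String> verbatim = new LinkedHashMap<>();
    verbatim.put("scientificName", "Lissotarsus reticulata Chaudoir, 1842");

    System.out.println(withDefaults(verbatim, defaults));
  }
}
```

With the defaults filled in, the record has higher taxa to corroborate the fuzzy match, so it no longer relies on the name alone.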
This record shows as incertae sedis but the lookup should find the species.
I'll investigate, cc @mdoering