gbif / checklistbank

GBIF Checklist Bank

Regression for Chordata #158

Status: Closed (thomasstjerne closed 3 years ago)

thomasstjerne commented 3 years ago

This one should probably not be in Bacteria when the verbatim data says Animalia > Chordata:

{
  "count": 1417,
  "verbatim_kingdom": "Animalia",
  "verbatim_phylum": "Chordata",
  "verbatim_class": "null",
  "verbatim_order": "null",
  "verbatim_family": "null",
  "verbatim_genus": "null",
  "verbatim_species": "NA",
  "verbatim_infra": "null",
  "verbatim_rank": "null",
  "verbatim_verbatimRank": "null",
  "verbatim_scientificName": "Tunicata",
  "verbatim_generic": "null",
  "verbatim_author": "null",
  "current_kingdom": "Animalia",
  "current_phylum": "Chordata",
  "current_class": "null",
  "current_order": "null",
  "current_family": "null",
  "current_genus": "null",
  "current_subGenus": "null",
  "current_species": "null",
  "current_scientificName": "Chordata",
  "current_acceptedScientificName": "Chordata",
  "current_kingdomKey": 1,
  "current_phylumKey": 44,
  "current_classKey": "null",
  "current_orderKey": "null",
  "current_familyKey": "null",
  "current_genusKey": "null",
  "current_subGenusKey": "null",
  "current_speciesKey": "null",
  "current_taxonKey": 44,
  "current_acceptedTaxonKey": 44,
  "proposed_kingdom": "Bacteria",
  "proposed_phylum": "Cyanobacteria",
  "proposed_class": "null",
  "proposed_order": "null",
  "proposed_family": "Ilictidae",
  "proposed_genus": "Tunicata",
  "proposed_subGenus": "null",
  "proposed_species": "null",
  "proposed_scientificName": "Tunicata Sidorov, 1969",
  "proposed_acceptedScientificName": "Tunicata Sidorov, 1969",
  "proposed_kingdomKey": 3,
  "proposed_phylumKey": 68,
  "proposed_classKey": "null",
  "proposed_orderKey": "null",
  "proposed_familyKey": 11289671,
  "proposed_genusKey": 11616033,
  "proposed_subGenusKey": "null",
  "proposed_speciesKey": "null",
  "proposed_taxonKey": 11616033,
  "proposed_acceptedTaxonKey": 11616033,
  "_key": 10598,
  "changes": {
    "kingdom": true,
    "kingdomKey": true,
    "phylum": true,
    "phylumKey": true,
    "family": true,
    "familyKey": true,
    "genus": true,
    "genusKey": true,
    "scientificName": true,
    "acceptedScientificName": true,
    "taxonKey": true
  }
}

mdoering commented 3 years ago

That's a matching problem, not really a backbone issue. The verbatim data is rather weak, and there is an exact match for Tunicata.

I agree that Bacteria and Animalia are worlds apart and should probably not lead to a match. Something to improve in the occurrence matching, hopefully without breaking something else in this delicate balance.
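
A minimal sketch of the kind of guard discussed here, assuming a hypothetical matcher that holds a list of name candidates together with their kingdoms. `Candidate` and `filter_by_kingdom` are illustrative names and not part of the checklistbank code; only the data values come from the record above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical match candidate; not a checklistbank class."""
    scientific_name: str
    kingdom: str


def filter_by_kingdom(verbatim_kingdom: Optional[str],
                      candidates: List[Candidate]) -> List[Candidate]:
    """Drop candidates whose kingdom contradicts the verbatim kingdom.

    Missing values (None, "", "null", "NA") are treated as "no information",
    so nothing is filtered in that case.
    """
    if verbatim_kingdom is None:
        return list(candidates)
    wanted = verbatim_kingdom.strip().lower()
    if wanted in {"", "null", "na"}:
        return list(candidates)
    return [c for c in candidates if c.kingdom.strip().lower() == wanted]


# Values taken from the record in this issue.
candidates = [Candidate("Tunicata Sidorov, 1969", "Bacteria")]
kept = filter_by_kingdom("Animalia", candidates)
unfiltered = filter_by_kingdom(None, candidates)
```

With the exact-name hit rejected, a matcher could then fall back to the supplied higher classification, which is what the current result (Chordata) reflects. Whether a hard filter or a score penalty fits the real matcher better is left open here.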

mdoering commented 3 years ago

If a rank like "class" were given, it would not match Bacteria, by the way.
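
The comment does not say how a supplied rank rules out the Bacteria hit. One plausible reading, sketched below as an assumption, is a rank comparison: the proposed taxon is a genus (its `proposed_genusKey` equals its `proposed_taxonKey`), so a verbatim rank of "class" would not agree with it. `NameCandidate` and `filter_by_rank` are hypothetical names, not checklistbank API.

```python
from typing import List, NamedTuple, Optional


class NameCandidate(NamedTuple):
    """Hypothetical match candidate; not a checklistbank class."""
    scientific_name: str
    rank: str  # upper-case rank label, e.g. "GENUS" or "CLASS"


def filter_by_rank(verbatim_rank: Optional[str],
                   candidates: List[NameCandidate]) -> List[NameCandidate]:
    """Keep only candidates whose rank equals the verbatim rank.

    If no usable rank is supplied (None, "", "null", "NA"), every
    candidate is kept, which mirrors the situation in this issue.
    """
    if verbatim_rank is None:
        return list(candidates)
    wanted = verbatim_rank.strip().upper()
    if wanted in {"", "NULL", "NA"}:
        return list(candidates)
    return [c for c in candidates if c.rank == wanted]


genus_hit = NameCandidate("Tunicata Sidorov, 1969", "GENUS")

# verbatim_rank is "null" in the record, so the genus-level hit survives.
no_rank = filter_by_rank("null", [genus_hit])

# With an explicit rank of "class" the genus-level hit is dropped.
with_rank = filter_by_rank("class", [genus_hit])
```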