gbif / checklistbank

GBIF Checklist Bank
Apache License 2.0

Regression for Hierochloe R.Br. #209

Open ahahn-gbif opened 2 years ago

ahahn-gbif commented 2 years ago

This seems wrong. Hierochloe R.Br. should remain a grass (Poaceae)

{
  "count": 2698,
  "verbatim_kingdom": "Plantae",
  "verbatim_phylum": "null",
  "verbatim_class": "null",
  "verbatim_order": "null",
  "verbatim_family": "null",
  "verbatim_genus": "null",
  "verbatim_species": "null",
  "verbatim_infra": "null",
  "verbatim_rank": "aggregate",
  "verbatim_verbatimRank": "null",
  "verbatim_scientificName": "Hierochloë odorata -ryhmä",
  "verbatim_generic": "null",
  "verbatim_author": "null",
  "current_kingdom": "Plantae",
  "current_phylum": "Tracheophyta",
  "current_class": "Liliopsida",
  "current_order": "Poales",
  "current_family": "Poaceae",
  "current_genus": "Anthoxanthum",
  "current_subGenus": "null",
  "current_species": "null",
  "current_scientificName": "Hierochloe R.Br.",
  "current_acceptedScientificName": "Anthoxanthum L.",
  "current_kingdomKey": 6,
  "current_phylumKey": 7707728,
  "current_classKey": 196,
  "current_orderKey": 1369,
  "current_familyKey": 3073,
  "current_genusKey": 2705971,
  "current_subGenusKey": "null",
  "current_speciesKey": "null",
  "current_taxonKey": 2703327,
  "current_acceptedTaxonKey": 2705971,
  "proposed_kingdom": "Plantae",
  "proposed_phylum": "Tracheophyta",
  "proposed_class": "Magnoliopsida",
  "proposed_order": "Gentianales",
  "proposed_family": "Rubiaceae",
  "proposed_genus": "Hierochloe",
  "proposed_subGenus": "null",
  "proposed_species": "null",
  "proposed_scientificName": "Hierochloe",
  "proposed_acceptedScientificName": "Hierochloe",
  "proposed_kingdomKey": 6,
  "proposed_phylumKey": 7707728,
  "proposed_classKey": 220,
  "proposed_orderKey": 412,
  "proposed_familyKey": 8798,
  "proposed_genusKey": 2703327,
  "proposed_subGenusKey": "null",
  "proposed_speciesKey": "null",
  "proposed_taxonKey": 2703327,
  "proposed_acceptedTaxonKey": 2703327,
  "_key": 6590,
  "changes": {
    "class": true,
    "classKey": true,
    "order": true,
    "orderKey": true,
    "family": true,
    "familyKey": true,
    "genus": true,
    "genusKey": true,
    "scientificName": true,
    "acceptedScientificName": true
  },
  "reviewed": false
}

ahahn-gbif commented 1 year ago

This still applies in 07/2023, though with far fewer records concerned:

{
  "count": 136,
  "verbatim_kingdom": "plantae",
  "verbatim_phylum": "null",
  "verbatim_class": "null",
  "verbatim_order": "null",
  "verbatim_family": "null",
  "verbatim_genus": "null",
  "verbatim_species": "null",
  "verbatim_infra": "null",
  "verbatim_rank": "null",
  "verbatim_verbatimRank": "null",
  "verbatim_scientificName": "Hierochloe odorata agg.",
  "verbatim_generic": "null",
  "verbatim_author": "null",
  "current_kingdom": "Plantae",
  "current_phylum": "Tracheophyta",
  "current_class": "Liliopsida",
  "current_order": "Poales",
  "current_family": "Poaceae",
  "current_genus": "Anthoxanthum",
  "current_subGenus": "null",
  "current_species": "Anthoxanthum nitens",
  "current_scientificName": "Hierochloe odorata (L.) P.Beauv.",
  "current_acceptedScientificName": "Anthoxanthum nitens (Weber) Y.Schouten & Veldkamp",
  "current_kingdomKey": 6,
  "current_phylumKey": 7707728,
  "current_classKey": 196,
  "current_orderKey": 1369,
  "current_familyKey": 3073,
  "current_genusKey": 2705971,
  "current_subGenusKey": "null",
  "current_speciesKey": 2703346,
  "current_taxonKey": 2703336,
  "current_acceptedTaxonKey": 2703346,
  "proposed_kingdom": "Plantae",
  "proposed_phylum": "Tracheophyta",
  "proposed_class": "Magnoliopsida",
  "proposed_order": "Gentianales",
  "proposed_family": "Rubiaceae",
  "proposed_genus": "Hierochloe",
  "proposed_subGenus": "null",
  "proposed_species": "null",
  "proposed_scientificName": "Hierochloe",
  "proposed_acceptedScientificName": "Hierochloe",
  "proposed_kingdomKey": 6,
  "proposed_phylumKey": 7707728,
  "proposed_classKey": 220,
  "proposed_orderKey": 412,
  "proposed_familyKey": 8798,
  "proposed_genusKey": 2703327,
  "proposed_subGenusKey": "null",
  "proposed_speciesKey": "null",
  "proposed_taxonKey": 2703327,
  "proposed_acceptedTaxonKey": 2703327,
  "_key": 43019,
  "changes": {
    "class": true,
    "classKey": true,
    "order": true,
    "orderKey": true,
    "family": true,
    "familyKey": true,
    "genus": true,
    "genusKey": true,
    "species": true,
    "speciesKey": true,
    "scientificName": true,
    "acceptedScientificName": true,
    "taxonKey": true
  },
  "reviewed": false
}

The likely cause: because of the "agg." qualifier, matching is done at genus level, which brings in a generic homonym that would not win at species level. This is better fixed in the occurrence-level mapping (add the family).
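To illustrate the "add family" suggestion, here is a minimal sketch that builds a match request with and without a family hint. It assumes the public GBIF species match endpoint and its `name`, `kingdom` and `family` query parameters; `build_match_url` is a hypothetical helper, and nothing is actually sent, so the effect on the match result is not shown here.

```python
from urllib.parse import urlencode

# Assumed endpoint: the public GBIF species match service (not called here).
MATCH_URL = "https://api.gbif.org/v1/species/match"


def build_match_url(name, kingdom=None, family=None):
    """Build a match request URL, including only the hints that are set."""
    params = {"name": name, "kingdom": kingdom, "family": family}
    # Drop unset hints so the query string carries only what the record supplies.
    params = {key: value for key, value in params.items() if value is not None}
    return f"{MATCH_URL}?{urlencode(params)}"


# As supplied in the second record: kingdom only.
without_family = build_match_url("Hierochloe odorata agg.", kingdom="plantae")

# With the family added in the occurrence mapping (Poaceae, as in the current_ fields).
with_family = build_match_url(
    "Hierochloe odorata agg.", kingdom="plantae", family="Poaceae"
)

print(without_family)
print(with_family)
```

The family value would have to come from the publisher's occurrence mapping; it is hard-coded above only for illustration.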

mdoering commented 1 year ago

Indeed. We should probably use the species matching to find out the right genus. It is odd, though, to have a true homonym accepted; I will raise this with COL.
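A rough sketch of that idea, not the actual matching code: strip the aggregate marker, look the binomial up at species level, and use the family found there to pick between the genus homonyms. The lookup tables, the candidate keys and all function names are hypothetical stand-ins for backbone queries; only the names and families are taken from the records above.

```python
import re
import unicodedata

# Aggregate markers seen in the two records above (after folding diacritics);
# this list is an assumption and certainly not exhaustive.
AGGREGATE_SUFFIX = re.compile(r"(?:\s+agg\.?|\s*-ryhma)\s*$", re.IGNORECASE)

# Hypothetical in-memory stand-ins for backbone lookups; keys are placeholders.
GENUS_CANDIDATES = {
    "Hierochloe": [
        {"key": "genus-A", "family": "Poaceae"},
        {"key": "genus-B", "family": "Rubiaceae"},
    ],
}
SPECIES_FAMILY = {"Hierochloe odorata": "Poaceae"}


def fold(name):
    """Remove diacritics, e.g. 'Hierochloë' becomes 'Hierochloe'."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def strip_aggregate(name):
    """Drop a trailing aggregate marker such as 'agg.' or '-ryhma'."""
    return AGGREGATE_SUFFIX.sub("", name).strip()


def resolve_genus(verbatim_name):
    """Pick a genus candidate, using a species-level hit to break homonym ties."""
    binomial = strip_aggregate(fold(verbatim_name))
    parts = binomial.split()
    if not parts:
        return None
    candidates = GENUS_CANDIDATES.get(parts[0], [])
    family = SPECIES_FAMILY.get(binomial)
    if family is not None:
        matching = [c for c in candidates if c["family"] == family]
        if len(matching) == 1:
            return matching[0]
    # Without species-level evidence, only an unambiguous genus is returned.
    return candidates[0] if len(candidates) == 1 else None


for name in ("Hierochloe odorata agg.", "Hierochloë odorata -ryhmä", "Hierochloe agg."):
    print(name, "->", resolve_genus(name))
```

With this toy data both verbatim names from the records resolve to the Poaceae candidate, while a bare genus-level aggregate stays unresolved instead of silently picking a homonym.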

The grass one is being conserved: https://www.ipni.org/n/18212-1 over a pre-Linnean grass genus: https://www.ipni.org/n/18211-1

But nothing about Rubiaceae. Actually our current backbone treats the genus as Rubiaceae: https://www.gbif.org/species/2703327

mdoering commented 1 year ago

... which happens because of a Plazi classification, which in turn uses older COL data. These days at least it is considered a grass by COL: https://www.catalogueoflife.org/data/taxon/67GRF

@myrmoteras @gsautter this is an example of feeding ourselves with circular dependencies. I would really prefer to see the original source classification in Plazi data.