gbif / checklistbank

GBIF Checklist Bank
Apache License 2.0

Regression for incertae sedis #234

Closed tobiasgf closed 2 years ago

tobiasgf commented 2 years ago

It seems that we have a lot (43K) of occurrences of the zooplankton class "Appendicularia" that do not get interpreted to the right (or any) taxon. https://en.wikipedia.org/wiki/Larvacea

The occurrences seem all to be from marine samples: https://www.gbif.org/occurrence/map?taxon_key=0&advanced=1&verbatim_scientific_name=Appendicularia&occurrence_status=present

Matching them to Appendicularia in the above sense (the zooplankton class) would thus be correct.

{
  "count": 43057,
  "verbatim_kingdom": "null",
  "verbatim_phylum": "null",
  "verbatim_class": "null",
  "verbatim_order": "null",
  "verbatim_family": "null",
  "verbatim_genus": "null",
  "verbatim_species": "null",
  "verbatim_infra": "null",
  "verbatim_rank": "null",
  "verbatim_verbatimRank": "null",
  "verbatim_scientificName": "Appendicularia",
  "verbatim_generic": "null",
  "verbatim_author": "null",
  "current_kingdom": "incertae sedis",
  "current_phylum": "null",
  "current_class": "null",
  "current_order": "null",
  "current_family": "null",
  "current_genus": "null",
  "current_subGenus": "null",
  "current_species": "null",
  "current_scientificName": "incertae sedis",
  "current_acceptedScientificName": "null",
  "current_kingdomKey": 0,
  "current_phylumKey": "null",
  "current_classKey": "null",
  "current_orderKey": "null",
  "current_familyKey": "null",
  "current_genusKey": "null",
  "current_subGenusKey": "null",
  "current_speciesKey": "null",
  "current_taxonKey": 0,
  "current_acceptedTaxonKey": "null",
  "proposed_kingdom": "null",
  "proposed_phylum": "null",
  "proposed_class": "null",
  "proposed_order": "null",
  "proposed_family": "null",
  "proposed_genus": "null",
  "proposed_subGenus": "null",
  "proposed_species": "null",
  "proposed_scientificName": "incertae sedis",
  "proposed_acceptedScientificName": "null",
  "proposed_kingdomKey": "null",
  "proposed_phylumKey": "null",
  "proposed_classKey": "null",
  "proposed_orderKey": "null",
  "proposed_familyKey": "null",
  "proposed_genusKey": "null",
  "proposed_subGenusKey": "null",
  "proposed_speciesKey": "null",
  "proposed_taxonKey": 0,
  "proposed_acceptedTaxonKey": "null",
  "_key": 532,
  "changes": {
    "kingdom": true,
    "kingdomKey": true
  },
  "reviewed": false
}
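
For readers working with records like the one above, here is a minimal sketch of how such a record could be read programmatically. It assumes (from this one record only) that missing values are exported as the string "null" and that a taxon key of 0 goes together with "incertae sedis". The helper names `clean`, `is_unmatched` and `changed_fields` are hypothetical and not part of any GBIF tool.

```python
import json

NULL_MARKER = "null"  # assumption: the export writes missing values as the string "null"
UNMATCHED_KEY = 0     # assumption: key 0 accompanies "incertae sedis" in the record above


def clean(record):
    """Return a copy of the record with "null" strings turned into None."""
    return {k: (None if v == NULL_MARKER else v) for k, v in record.items()}


def is_unmatched(record, prefix="current"):
    """True if the given interpretation (current or proposed) has taxon key 0."""
    return clean(record).get(f"{prefix}_taxonKey") == UNMATCHED_KEY


def changed_fields(record):
    """Names of the fields flagged as changed between current and proposed."""
    return sorted(k for k, flag in record.get("changes", {}).items() if flag)


# A shortened copy of the record from this comment.
record = json.loads("""
{
  "count": 43057,
  "verbatim_kingdom": "null",
  "verbatim_scientificName": "Appendicularia",
  "current_kingdom": "incertae sedis",
  "current_taxonKey": 0,
  "proposed_kingdom": "null",
  "proposed_taxonKey": 0,
  "changes": {"kingdom": true, "kingdomKey": true},
  "reviewed": false
}
""")

print(is_unmatched(record, "current"), is_unmatched(record, "proposed"))
print(changed_fields(record))
```

Read this way, the record is unmatched in both the current and the proposed interpretation; only the kingdom fields are flagged as changed.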

mdoering commented 2 years ago

The backbone contains both a plant genus and a class named Appendicularia, so without more taxonomic context (e.g. rank or kingdom) we cannot match the name to either of them: http://backbonebuild-vh.gbif.org:9000/species/match?verbose=true&name=Appendicularia
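
To illustrate how extra context changes the request, here is a rough sketch that builds match URLs with and without `rank` and `kingdom`. It uses the public `https://api.gbif.org/v1/species/match` endpoint rather than the backbone build host linked above, which is an assumption on my side. The value `Animalia` for the zooplankton class is likewise my assumption and is not stated in this thread. `build_match_url` and `fetch_match` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: public GBIF name matching endpoint, not the build host linked above.
MATCH_ENDPOINT = "https://api.gbif.org/v1/species/match"


def build_match_url(name, rank=None, kingdom=None, verbose=True):
    """Build a species match URL, adding rank/kingdom only when supplied."""
    params = {"name": name}
    if rank:
        params["rank"] = rank
    if kingdom:
        params["kingdom"] = kingdom
    if verbose:
        params["verbose"] = "true"
    return f"{MATCH_ENDPOINT}?{urlencode(params)}"


def fetch_match(url):
    """Fetch and decode the JSON response (needs network access, not called here)."""
    with urlopen(url) as response:
        return json.load(response)


# Name only: the situation described in this issue.
ambiguous = build_match_url("Appendicularia")

# With rank and kingdom, the matcher has context to choose between the homonyms.
as_class = build_match_url("Appendicularia", rank="CLASS", kingdom="Animalia")

print(ambiguous)
print(as_class)
```

The records in question carry no verbatim rank or kingdom, so in practice that context would have to come from the publisher or from a dataset-level configuration.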

Something for data management to get in touch with the publisher about? Or we need to implement the dataset config proposal I made, which would also fix other marine dataset problems: https://github.com/gbif/backbone-feedback/issues/192

Plankton datasets seem to suffer quite a bit from very sparse taxonomic information being published.

mdoering commented 2 years ago

Nothing to fix in the backbone at least, so closing.