gbif / backbone-feedback


Subphylum Crustacea interpreted as (doubtful) genus Crustacea #395

Open dagendresen opened 3 years ago

dagendresen commented 3 years ago

When publishing a new dataset on plankton, the occurrence records with scientificName = subphylum Crustacea are interpreted as a (flagged doubtful) genus Crustacea.

See subphylum Crustacea, urn:lsid:marinespecies.org:taxname:1066, in WoRMS, and the taxon record for Crustacea from WoRMS in the GBIF backbone, https://www.gbif.org/species/155486131

[Screenshot, 2021-04-22 11:13:03]

dagendresen commented 3 years ago

See also https://github.com/gbif/pipelines/issues/217

ManonGros commented 3 years ago

That's probably more a question for @mdoering but from what I understand:

  1. the subphylum Crustacea is not in the GBIF backbone (the only taxon ranks integrated in the backbone are Kingdom, Phylum, Class, Order, Family, Genus, Species, Subspecies and unranked for some OTUs: https://data-blog.gbif.org/post/gbif-backbone-taxonomy/)
  2. So Crustacea is matched to the only thing we have in the backbone: https://www.gbif.org/species/10996236 which is a Hymenoptera genus and is labelled as doubtful because it doesn't have any associated species.

So for now the solution would be to use names for one of the ranks available in the backbone.

dagendresen commented 3 years ago

As I understand it, these are nauplii (crustacean larvae), which are very hard to identify to a lower taxon rank, so the occurrences already include all the taxon information that can be known.

rukayaj commented 3 years ago

@ManonGros Are you suggesting we go up one level to Phylum?

dagendresen commented 3 years ago

I see that the dataset already includes the taxon information phylum = Arthropoda. Do you mean to reduce the information value by removing subphylum altogether? Would it maybe be possible to first explore the erroneous GBIF interpretation of these occurrences, which ignores supplied information such as rank? :-)

Apropos, see also the taxon Crustacea Brünnich, 1772 from the Norwegian Species Checklist https://www.gbif.org/species/167991571

I believe that many more similar issues will arise when mobilizing more marine datasets ;-)

mdoering commented 3 years ago

Yes, GBIF must be able to understand Crustacea and link to somewhere sensible without the need to dumb down the supplied data. We can maintain a small manual lookup map for important cases that often go wrong, like this one.

Would phylum Arthropoda be the best place to link those records to? COL now accepts basically any rank and does not restrict higher taxa to Linnean ranks only. Hopefully in the not too distant future GBIF will allow the same. There are more serious problems on the horizon, such as Aves being a subphylum. Vertebrates also do not exist in the backbone right now, again a subphylum. @timrobertson100 maybe we should quickly add dwc:subphylum to april fools? Seems this is a really important one for occurrences.
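
For illustration only, the "small manual lookup map" idea could be sketched roughly as below. This is not GBIF's actual matching code: `MANUAL_OVERRIDES`, `Override`, and `lookup_override` are hypothetical names, and the Crustacea to Arthropoda target is just the candidate raised in this comment, not a settled decision.

```python
from typing import NamedTuple, Optional


class Override(NamedTuple):
    """Backbone taxon that a problematic name should be linked to."""
    target_name: str
    target_rank: str


# Hypothetical override table keyed by (lower-cased name, lower-cased supplied rank).
# The single entry reflects the suggestion in this thread; it is an assumption.
MANUAL_OVERRIDES = {
    ("crustacea", "subphylum"): Override("Arthropoda", "phylum"),
}


def lookup_override(name: str, rank: Optional[str]) -> Optional[Override]:
    """Return a manual override for (name, rank), or None to fall back to normal matching."""
    if rank is None:
        # Without a supplied rank we cannot tell the subphylum from the genus homonym.
        return None
    key = (name.strip().lower(), rank.strip().lower())
    return MANUAL_OVERRIDES.get(key)
```

A matcher would consult `lookup_override` before its normal name lookup, so a record with name=Crustacea and rank=subphylum never reaches the genus homonym, while records without a rank are left to the regular matching.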

timrobertson100 commented 3 years ago

"maybe we should quickly add dwc:subphylum to april fools"

Seems sensible. It will also mean that taxa at these ranks will need to be added to the backbone.

dagendresen commented 2 years ago

Ping - Norwegian marine data publishers are asking for status on this issue :-)

mdoering commented 2 years ago

Not many new ranks have been added to DwC, and dwc:subphylum is not ratified. We will have to deal with this problem in the occurrence matching for now, linking all those records with name=Crustacea and rank=subphylum at least to Arthropoda and not to a genus.

I could change the matching to refuse to match to a genus when the given rank is family or above. That seems like a sensible approach to me. Any reservations?
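
A minimal sketch of that rank guard, purely as an assumption about how such a rule might look and not the actual matcher code. `RANK_ORDER` and `accept_match` are hypothetical names, and the rank list is a simplified ordering that includes subphylum for the sake of the example.

```python
from typing import Optional

# Simplified, hypothetical rank ordering from highest to lowest.
RANK_ORDER = [
    "kingdom", "phylum", "subphylum", "class", "order",
    "family", "genus", "species", "subspecies",
]


def accept_match(supplied_rank: Optional[str], candidate_rank: str) -> bool:
    """Reject a genus candidate when the supplied rank is family or above."""
    if supplied_rank is None:
        return True  # no rank given: nothing to contradict
    supplied = supplied_rank.strip().lower()
    if supplied not in RANK_ORDER:
        return True  # unknown rank: do not block the match
    family_or_above = RANK_ORDER.index(supplied) <= RANK_ORDER.index("family")
    if candidate_rank.strip().lower() == "genus" and family_or_above:
        return False
    return True
```

With this rule, a record supplied as subphylum Crustacea would no longer be linked to the doubtful genus Crustacea, while records supplied with rank genus, or with no rank at all, would match as before.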

rukayaj commented 2 years ago

Makes sense to me...