trias-project / daisie-checklist

🇪🇺 DAISIE - Inventory of alien species in Europe
https://trias-project.github.io/daisie-checklist/
MIT License

compare taxonRank info with info from nameparser function #11

Closed LienReyserhove closed 5 years ago

LienReyserhove commented 5 years ago

Information about the taxon rank can be provided in two ways:

  1. By using the information in subtaxon_rank in the taxon core. This field holds the original information from DAISIE, so it should apply only to subtaxa.
  2. By using the information in rankmarker, provided by the GBIF nameparser.

It would be interesting to see the differences between what the GBIF nameparser returns and the content of the subtaxon_rank field.
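
A comparison like this can be sketched as a count of every (rankmarker, subtaxon_rank) combination. The snippet below is only an illustration under assumptions: the records are made up, and the dictionary layout and the `compare_ranks` helper are hypothetical rather than taken from the checklist code.

```python
from collections import Counter

# Hypothetical records: each pairs the nameparser rankmarker with the
# DAISIE subtaxon_rank for one taxon (None stands for a missing value).
records = [
    {"rankmarker": "sp.", "subtaxon_rank": None},
    {"rankmarker": "sp.", "subtaxon_rank": None},
    {"rankmarker": "infrasp.", "subtaxon_rank": "subsp."},
    {"rankmarker": "sp.", "subtaxon_rank": "hyb."},
    {"rankmarker": None, "subtaxon_rank": "hyb."},
]


def compare_ranks(rows):
    """Count each (rankmarker, subtaxon_rank) pair, most frequent first."""
    counts = Counter((r["rankmarker"], r["subtaxon_rank"]) for r in rows)
    return counts.most_common()


comparison = compare_ranks(records)
for (rankmarker, subtaxon_rank), n in comparison:
    print(rankmarker, subtaxon_rank, n)
```

Sorting by frequency puts the dominant combination first, which makes it easy to see how many records the two sources disagree on.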

LienReyserhove commented 5 years ago

This is the result of the comparison (see the next comment for interpretation):

nameparser_rankmarker taxon_subtaxon_rank records
sp. 10774
infrasp. subsp. 518
sp. hyb. 329
infrasp. var. 141
infrasp. 136
NA hyb. 54
NA 30
morph 23
sp. var. 18
NA agg. 14
morph subsp. 11
cv. hyb. 10
sp. agg. 6
pv. 5
sp. 5
sp. subsp. 5
cv. 4
infrasp. f. sp. 4
var. var. 4
infrasp. x 3
strain subsp. 3
var. subsp. 3
infrasp. f. 2
infrasubsp. 2
var. 2
cv. subsp. 1
cv. var. 1
f. var. 1
infrasp. agg. 1
infrasp. Crous 1
infrasp. subspecies 1
infrasp. var 1
infrasubsp. hyb. 1
morph var. 1
sp. Cytosporina sp. 1
sp. f. sp. 1
sp. sp. 1
subf. var. 1
subsp. 1
subvar. subsp. 1
NA subsp. 1

LienReyserhove commented 5 years ago

I noticed the following:

So, for about 11,000 species, the information provided by rankmarker can be considered OK (and even better than subtaxon_rank). Should we just use the taxon rank information provided by GBIF?

@peterdesmet, @DavidRoy, @qgroom?

DavidRoy commented 5 years ago

Yes, I think so. With a project more than 10 years old, we cannot go back and investigate the inevitable errors in the DAISIE database. I think we (you!) do the best job of mapping that can be done in the time available. Thanks for all your efforts!

LienReyserhove commented 5 years ago

Ok, thanks for the response! Closing the issue.

peterdesmet commented 5 years ago

GBIF rankmarker will indeed provide cleaner information than taxon_subtaxon_rank, even if there might be some loss of information.
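
One way to gauge that possible loss is to list the records where subtaxon_rank carries a value but rankmarker is missing (the "NA hyb." and "NA agg." rows in the comparison above). This is a rough sketch on made-up data; the record layout and the `lost_ranks` helper are hypothetical.

```python
# Hypothetical records; None stands for a missing value (NA in the table).
records = [
    {"name": "Taxon a", "rankmarker": "sp.", "subtaxon_rank": None},
    {"name": "Taxon b", "rankmarker": None, "subtaxon_rank": "hyb."},
    {"name": "Taxon c", "rankmarker": None, "subtaxon_rank": "agg."},
    {"name": "Taxon d", "rankmarker": None, "subtaxon_rank": None},
]


def lost_ranks(rows):
    """Return rows whose subtaxon_rank is filled in but whose rankmarker is missing."""
    return [r for r in rows if r["rankmarker"] is None and r["subtaxon_rank"]]


at_risk = lost_ranks(records)
for row in at_risk:
    print(row["name"], row["subtaxon_rank"])
```

Records returned here are the ones whose rank information would disappear if only rankmarker were kept.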