Open gbif-portal opened 12 months ago
This is a question for @thomasstjerne
In theory, we could make an effort to remove them, but I doubt that filtering the source is a good idea. These names are present and have IDs in the BOLD taxonomy (for example, http://bins.boldsystems.org/index.php/Taxbrowser_Taxonpage?taxid=990886 ). If a user doesn't want these names, they can be filtered out by name type. On the other hand, if a user wants an unfiltered representation of the BOLD taxonomy, they can only get that if we don't remove any taxon IDs.
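For anyone who wants to try the name type filter on the client side, here is a minimal sketch rather than an official recipe. It assumes the GBIF species search API accepts `datasetKey` and `nameType` query parameters that mirror the portal URL's `dataset_key` and `name_type`, and that each returned record carries a `nameType` field. The helper names and the sample records are hypothetical and only illustrate the idea.

```python
from urllib.parse import urlencode

# Assumed endpoint of the GBIF species search API.
GBIF_SPECIES_SEARCH = "https://api.gbif.org/v1/species/search"

# Dataset key taken from the Referer URL quoted in this issue.
BOLD_DATASET_KEY = "4cec8fef-f129-4966-89b7-4f8439aba058"


def build_search_url(dataset_key, name_type=None, limit=20, offset=0):
    """Build a search URL, optionally restricted to one name type (hypothetical helper)."""
    params = {"datasetKey": dataset_key, "limit": limit, "offset": offset}
    if name_type is not None:
        # Assumption: the API parameter is the camelCase form of the portal's name_type.
        params["nameType"] = name_type
    return f"{GBIF_SPECIES_SEARCH}?{urlencode(params)}"


def drop_name_types(records, unwanted=("NO_NAME",)):
    """Keep only records whose nameType is not in the unwanted set."""
    return [r for r in records if r.get("nameType") not in unwanted]


# Hypothetical records, shaped like the fields assumed above.
sample = [
    {"scientificName": "Apis mellifera", "nameType": "SCIENTIFIC"},
    {"scientificName": "red 1", "nameType": "NO_NAME"},
]

url = build_search_url(BOLD_DATASET_KEY, name_type="NO_NAME")
kept = drop_name_types(sample)
```

The first helper lists the odd names for review; the second drops them from results that were already downloaded, so the source dataset itself stays unfiltered.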
BOLD allows the submitter of the data (the customer) to enter any name they want. There are no spelling checks or plausibility checks. These are not species. They are data descriptors chosen by the customer. Sometimes it's a species, sometimes it's a color, and sometimes it's the name of the collector. I reported more than 200 illegitimate "species" names such as "red 1" or "indet. JK", and BOLD support did nothing, as they allow these kinds of discrepancies. To be fair, most submitters to BOLD make an effort, but some don't, and there is no control over those cases.
Odd BOLD names
The BOLD checklist dataset contains some thousand "names" which are neither BOLD identifiers nor scientific names. Are they important or could they maybe be removed from the dataset?
System: Safari 16.5.2 / Mac OS X 10.15.7
Referer: https://www.gbif.org/species/search?dataset_key=4cec8fef-f129-4966-89b7-4f8439aba058&name_type=NO_NAME&origin=SOURCE&advanced=1
Window size: width 1659 - height 975
System health at time of feedback: OPERATIONAL