gbif / portal-feedback

User feedback for the GBIF API, website and published data. You can ask questions here. 🗨❓

Odd BOLD names #5043

Open gbif-portal opened 12 months ago

gbif-portal commented 12 months ago

Odd BOLD names

The BOLD checklist dataset contains some thousand "names" which are neither BOLD identifiers nor scientific names. Are they important or could they maybe be removed from the dataset?


System: Safari 16.5.2 / Mac OS X 10.15.7
Referer: https://www.gbif.org/species/search?dataset_key=4cec8fef-f129-4966-89b7-4f8439aba058&name_type=NO_NAME&origin=SOURCE&advanced=1
Window size: width 1659 - height 975
System health at time of feedback: OPERATIONAL

ManonGros commented 12 months ago

This is a question for @thomasstjerne

thomasstjerne commented 12 months ago

In theory, we could make an effort to remove them, but I doubt it is a good idea to filter the source. These names are present and have IDs in the BOLD taxonomy (for example, http://bins.boldsystems.org/index.php/Taxbrowser_Taxonpage?taxid=990886). If a user doesn't want these names, they can be filtered out by name type. On the other hand, if a user wants an unfiltered representation of the BOLD taxonomy, they can only get that if we don't remove any taxon IDs.
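For anyone who wants to try the name-type filtering mentioned above, here is a minimal Python sketch. It assumes the public GBIF species search endpoint (`https://api.gbif.org/v1/species/search`) with its `datasetKey` and `nameType` parameters, and it takes the dataset key from the Referer URL in the original feedback. The helper names `build_search_url` and `drop_name_types` are made up for illustration, and the sample records are invented rather than real API output.

```python
from urllib.parse import urlencode

# Assumption: public GBIF species search endpoint.
GBIF_SPECIES_SEARCH = "https://api.gbif.org/v1/species/search"

# Dataset key copied from the Referer URL in the feedback above.
BOLD_DATASET_KEY = "4cec8fef-f129-4966-89b7-4f8439aba058"


def build_search_url(dataset_key, name_type=None, limit=20, offset=0):
    """Build a species search URL, optionally restricted to one name type."""
    params = {"datasetKey": dataset_key, "limit": limit, "offset": offset}
    if name_type is not None:
        params["nameType"] = name_type
    return f"{GBIF_SPECIES_SEARCH}?{urlencode(params)}"


def drop_name_types(records, unwanted=("NO_NAME",)):
    """Return only the records whose nameType is not in `unwanted`."""
    return [r for r in records if r.get("nameType") not in unwanted]


if __name__ == "__main__":
    # URL that lists only the names flagged as NO_NAME in the dataset.
    print(build_search_url(BOLD_DATASET_KEY, name_type="NO_NAME"))

    # Invented sample records, shaped like entries in a search response.
    sample = [
        {"scientificName": "Example name A", "nameType": "SCIENTIFIC"},
        {"scientificName": "Example name B", "nameType": "NO_NAME"},
    ]
    print(drop_name_types(sample))
```

Filtering on the client side like this leaves the checklist itself untouched, so users who want the complete BOLD taxonomy still get every taxon ID.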

Andreas-Bio commented 11 months ago

BOLD allows the submitter of the data (= the customer) to enter any name they want. There are no spelling or plausibility checks. These are not species; they are data descriptors chosen by the customer. Sometimes it's a species, sometimes a color, and sometimes the name of the collector. I reported more than 200 illegitimate "species" names such as "red 1" or "indet. JK", and BOLD support did nothing, as they allow this kind of discrepancy. To be fair, most submitters to BOLD make an effort, but some don't, and there is no way to address those cases.