gbif / portal-feedback

User feedback for the GBIF API, website and published data. You can ask questions here. 🗨❓

GrSciColl mapping: Bee Biology and Systematics Laboratory #4274

Open MortenHofft opened 1 year ago

MortenHofft commented 1 year ago

Bee Biology and Systematics Laboratory

https://www.gbif.org/dataset/10e44c48-0839-4a20-86d5-f0e23ae2e366

published by USDA-ARS Pollinating Insect-Biology, Management, Systematics Research

560K unmapped

All records have the same codes: collection code BBSL and institution code USDA-ARS.

In GrSciColl we have the collection Pollinating Insect-Biology, Management, Systematics Research (code: BBSL) under the institution USDA/ARS (code: SWSL).

We also have the institution USDA Agricultural Research Service (ARS), code: USDA-ARS.
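To illustrate the mismatch, here is a minimal Python sketch. It assumes a simple exact-code lookup against an in-memory copy of the registry entries mentioned in this comment. The `Collection` class, the `match` function and the data layout are hypothetical and are not GBIF's actual matching logic.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Collection:
    code: str
    name: str
    institution_code: str  # code of the institution that owns the collection


# Registry entries as described in this issue (hypothetical in-memory layout).
INSTITUTIONS = {
    "SWSL": "USDA/ARS",
    "USDA-ARS": "USDA Agricultural Research Service (ARS)",
}
COLLECTIONS = [
    Collection(
        code="BBSL",
        name="Pollinating Insect-Biology, Management, Systematics Research",
        institution_code="SWSL",
    ),
]


def match(
    institution_code: str, collection_code: str
) -> Tuple[Optional[str], Optional[Collection], bool]:
    """Look up both codes independently and report whether they agree."""
    institution = institution_code if institution_code in INSTITUTIONS else None
    collection = next((c for c in COLLECTIONS if c.code == collection_code), None)
    # The pair is consistent only if the collection belongs to the matched institution.
    consistent = (
        institution is not None
        and collection is not None
        and collection.institution_code == institution
    )
    return institution, collection, consistent


# Codes carried by every record in the dataset.
inst, coll, consistent = match("USDA-ARS", "BBSL")
```

Under these assumptions both codes resolve on their own, but the BBSL collection sits under SWSL rather than USDA-ARS, so the pair is flagged as inconsistent.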

I suppose we should either

Which of the two depends on what level they want to group their data at.

This is a bit difficult to untangle because USDA ARS has so many institutions/collections. I suspect the real world is more complex than our two-tier model. But it would be nice if we could ensure some level of consistency.

We could have institution USDA ARS and everything else as collections. Or we could have no occurrences attached to USDA ARS institution, and instead have multiple institutions with the prefix USDA/ARS which then could have multiple collections of their own. Or some mix? It isn't clear to me how this is best organised. But it looks like it might need some restructuring.

albenson-usgs commented 1 year ago

I noticed this when I was working with another USDA-ARS Bee Lab to review their data. I think setting institutionCode = USDA-ARS will be the most amenable to all the ARS labs, with a collection for each lab under the higher-level institution. I don't completely know what to do with all the legacy institutions, but that's what I'm going to recommend to any new USDA-ARS datasets coming in.