gbif / portal-feedback

User feedback for the GBIF API, website and published data. You can ask questions here. 🗨❓

Improve representation of hosting institutions #1382

Open ahahn-gbif opened 6 years ago

ahahn-gbif commented 6 years ago

The role of data host was introduced to give proper credit both to the owner ("publisher" in the sense of the GBIF registry: an institution registered as a data publisher and endorsed by a GBIF Participant node) and to the institution supporting publication ("host": likewise an endorsed GBIF publisher, providing hosting services to other institutions). Unlike publishers who host their own datasets, though, publishers who act (almost exclusively) as hosts for the datasets of other registered publishers still have relatively limited visibility. Their publisher page exists, but it lists only a few numbers (dataset count with filter link, publisher count, countries) and appears largely empty. Example: https://www.gbif.org/publisher/a86e9e36-12ec-49a4-a94c-c0c981fffb71.

To give appropriate credit to institutions that act as hosts, and to motivate and support hosts in their important function, the following improvements to the page are proposed:

  1. an occurrence map including occurrences from hosted datasets (hostingKey)
  2. a list of the publishers supported by the host [a list of datasets already exists, and should remain]
  3. an occurrence search by hostingKey, and consequently
  4. counts and metrics for hosted data (min. starting point: occurrence count)
  5. for checklist datasets: counts and possibly list of taxa

NB: some hosts have a mixed model, publishing own datasets as well as hosting datasets for other institutions. Both types of data need to be included.
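As a rough illustration of items 3 and 4 and of the mixed model, the sketch below builds two occurrence-search URLs for one organization: one for the data it publishes and one for the data it hosts. It assumes that the public GBIF occurrence search accepts `publishingOrg` and `hostingOrganizationKey` as filters (the latter standing in for the `hostingKey` mentioned above) and that `limit=0` returns only the count; the function name `occurrence_count_urls` is made up for this example.

```python
from urllib.parse import urlencode

API_BASE = "https://api.gbif.org/v1"  # public GBIF API root


def occurrence_count_urls(org_key: str) -> dict:
    """Build occurrence-search URLs for data an organization publishes and hosts.

    Assumption: `publishingOrg` and `hostingOrganizationKey` are the relevant
    search filters; `limit=0` asks for the count without any records.
    """
    def url(param: str) -> str:
        return f"{API_BASE}/occurrence/search?" + urlencode({param: org_key, "limit": 0})

    return {
        "published": url("publishingOrg"),          # the organization's own datasets
        "hosted": url("hostingOrganizationKey"),    # datasets hosted for others (assumed name)
    }


# Example with the organization key from the publisher page linked above.
urls = occurrence_count_urls("a86e9e36-12ec-49a4-a94c-c0c981fffb71")
for role, link in urls.items():
    print(role, link)
```

A host page for a mixed-model organization would need both queries, since neither filter alone covers own and hosted data.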

myrmoteras commented 6 years ago

Just to make sure: "Publisher" = journal article publisher; "host" = data publisher sensu GBIF?

myrmoteras commented 6 years ago

Is there a possibility to add "counts and/or lists of the taxa submitted by the host"?

ahahn-gbif commented 6 years ago

Sorry, that was confusing - I added a few more words. No, not quite:

"Publisher" = publishing organization sensu GBIF, i.e. a registered organization, endorsed by a GBIF Participant node, that can publish datasets through the network.

"Host" = likewise a GBIF-registered and endorsed organization, hosting datasets for its own or other organizations.

For most registered organizations, the two roles of "publisher" and "host" coincide: they host their own datasets. In that case, the role of the host is not specially emphasized. Where the publisher/owner of the dataset and the host are different organizations, the hosting role of the organization is made explicit.

In the model discussed, the Journal will be represented as a data publisher, which already contains the full representation (map, counts, metrics, list of datasets etc). The part that needs attention is the improved representation of Plazi (and others) as a host.
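The publisher/host distinction can be sketched as a small check. This is only an illustration under stated assumptions: the `Dataset` class, its field names, and the keys used are hypothetical and are not the registry's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    # Hypothetical, simplified record; field names are illustrative only.
    title: str
    publisher_key: str  # organization that owns/publishes the dataset
    host_key: str       # organization that hosts the dataset


def host_shown_explicitly(ds: Dataset) -> bool:
    """The hosting role is emphasized only when host and publisher differ."""
    return ds.host_key != ds.publisher_key


# Placeholder keys, not real registry identifiers.
own = Dataset("Own collection", publisher_key="org-a", host_key="org-a")
hosted = Dataset("Journal dataset", publisher_key="org-journal", host_key="org-host")

print(host_shown_explicitly(own))     # publisher hosts its own data
print(host_shown_explicitly(hosted))  # publisher and host are different organizations
```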

ahahn-gbif commented 6 years ago

Yes, we should include checklist datasets as well. Thanks for the reminder!

myrmoteras commented 6 years ago

@ahahn-gbif what will the implications be for the host, i.e. Guido, when uploading new DwC-A files?

ahahn-gbif commented 6 years ago

Re-linking the existing entries is likely best done from here, but we would need the GBIF publishers to exist before that is possible.