gbif / hosted-portals

Support material for establishing the GBIF Hosted Portals
Apache License 2.0

Counts and links - it isn't clear where to link users and what numbers to show #184

Open MortenHofft opened 2 years ago

MortenHofft commented 2 years ago

There are many cases where it is unclear what numbers to show and where to link users.

Organism ID

On GBIF.org, when an occurrence has an organismId, we provide a link to see all occurrences of that organism.

[Screenshot, 2021-09-16]

But on a hosted portal it isn't clear what to do. What number should it show, for instance? The eagle in this example has been seen in 4 countries, so a portal scoped to a single country would only hold some of its occurrences.

EventID

Same as above, but with eventID. This is relevant to taxonomically scoped portals, as sampling events can span taxa. E.g. https://www.gbif.org/occurrence/1280698310
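To make the question concrete, here is a minimal sketch of the two counts a portal could show for an organismId or eventID: the global GBIF count and the count restricted to the portal's scope. It only builds query URLs for the public GBIF occurrence search API (`limit=0` asks for the count without any records). The helper name `count_urls`, the example ID and the country scope are assumptions for illustration, not how the hosted portals are actually implemented.

```python
from urllib.parse import urlencode

GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def count_urls(field, value, scope):
    """Return (global_url, scoped_url) for counting occurrences.

    field: "organismId" or "eventId" (GBIF occurrence search parameters).
    scope: extra filters describing the portal scope, e.g. {"country": "DK"}.
           This is a hypothetical scope; a real portal scope may be more complex.
    """
    base = {field: value, "limit": 0}  # limit=0: we only need the total count
    global_url = f"{GBIF_OCCURRENCE_SEARCH}?{urlencode(base)}"
    scoped_url = f"{GBIF_OCCURRENCE_SEARCH}?{urlencode({**base, **scope})}"
    return global_url, scoped_url


# "eagle-123" is a made-up organismId used only for illustration.
global_url, scoped_url = count_urls("organismId", "eagle-123", {"country": "DK"})
print(global_url)
print(scoped_url)
```

Fetching each URL and reading the `count` field of the JSON response would give the two candidate numbers. Which one to display, and where the link should lead, is the open question of this issue.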

Occurrence clusters

What happens when a record clusters with occurrences that aren't included in the portal scope? This is likely to happen for nationally and institutionally scoped portals. A museum record might well be clustered with a record in another museum. Do we show the cluster? Do we link to GBIF.org?

E.g. this record from NHM Rotterdam is clustered with records from a Naturalis dataset: https://www.gbif.org/occurrence/2570364562/cluster

Publisher pages

Many of the participants have expressed the desire to have publisher pages. But publisher activities often transcend national boundaries. How do we decide on the numbers of:

Dataset pages

Same as for publisher pages. Many datasets cross boundaries and taxa, and even institutions and collections. So for a dataset page on e.g. the BISON website, what counts and information should we show? The dataset isn't just about the US (in this example).

MortenHofft commented 2 years ago

Cluster example

@langeveldNMR there are 307 occurrences in your portal that cluster (and at least some do so with records from other datasets). Can you help me think through the best solution here, please? I see these options for handling 'not-in-scope' occurrences:

Are there other options? Which would you prefer?

MortenHofft commented 2 years ago

Publisher page example

@camiplata and @aduarte-sib you might be able to guide me here?

Short version

Do we present publishers with all their activities, or do we scope it to "this publisher in the context of our country/network/site"?

Specific example

An example from the Colombian site: this publisher also publishes occurrence records about Europe. On the publisher page we normally show some numbers/counts to give the user a quick overview of the publisher's activities, e.g. this many occurrences, this many datasets, etc.

BUT, if we want to have publisher pages as part of the hosted portals, then we need to decide what to do with these links and counts. In other words: what does the publisher record look like on the Colombian site, which only deals with occurrences in Colombia? What numbers should we show for occurrences and datasets when the publisher also publishes datasets about other countries?

I can think of these options:

What would the publishers prefer

How would the publishers prefer their work to be represented? It feels a bit silly to exclude their non-national contributions. Wouldn't they prefer to show their full activities? (That said, GBIF activities are only a subset to begin with.)

langeveldNMR commented 2 years ago

@MortenHofft Thanks for pointing this out. The major advantage of the hosted portal for NMR compared to our dataset on GBIF is the clear indication that the records on the hosted portal all are in the NMR collection and the identity of the Natural History Museum Rotterdam is more clearly shown to digital external visitors. Therefore, I would definitely not like to see clustering records from other datasets available on GBIF be added to the results/display on the NMR hosted portal. Please do not do this.

I would prefer to keep them separate on the hosted portal detail page, as they are right now (in the cluster tab): https://specimens.hetnatuurhistorisch.nl/data?entity=3064139435&filter=eyJtdXN0Ijp7InRheG9uS2V5IjpbNzgwNjg0OF19fQ%3D%3D&from=0 I have no problem with adding a link to GBIF for those clustered records, with the clear indication that those specimens are not from the NMR collection; something like: "The following occurrence records on GBIF do not belong to this publisher's collection, but do match the present occurrence record."

I think this solution suits both NMR (where the identity of the hosted portal is not compromised with records from other publishers) and any researchers looking for more data (where they are clearly pointed to similar records from other datasets that may be of use to them).

langeveldNMR commented 2 years ago

Just speaking for NMR, I would like any data on the publisher page on the hosted portals to be confined to data pertaining to the dataset(s) shared in the portal scope. For NMR that would include counts etc. for https://www.gbif.org/occurrence/search?dataset_key=a307e4d7-1de2-4adc-95d5-a0a8d5f57236 but exclude those for https://www.gbif.org/dataset/6db2a74e-98c5-4be3-ae30-3ec8dc68b0f4 This last dataset contains only human observations and therefore falls outside the NMR hosted portal scope, which should only show the museum collection holdings.
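The preference above can be sketched as a count query restricted to the publisher's datasets that are also in the portal scope. This is only an illustration that builds a URL for the public GBIF occurrence search API; the function name `scoped_count_url` and the idea of passing the scope as a list of dataset keys are assumptions, not the hosted portal's actual configuration.

```python
from urllib.parse import urlencode

GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"

# Dataset keys taken from the comment above.
IN_SCOPE = "a307e4d7-1de2-4adc-95d5-a0a8d5f57236"      # shared in the portal scope
OUT_OF_SCOPE = "6db2a74e-98c5-4be3-ae30-3ec8dc68b0f4"  # human observations, excluded


def scoped_count_url(publisher_datasets, portal_datasets):
    """Build a count query limited to publisher datasets inside the portal scope.

    Returns None when none of the publisher's datasets are in scope.
    """
    in_scope = set(portal_datasets)
    keys = [key for key in publisher_datasets if key in in_scope]
    if not keys:
        return None
    # Repeating datasetKey matches any of the listed datasets;
    # limit=0 asks for the total count without returning records.
    params = [("datasetKey", key) for key in keys] + [("limit", 0)]
    return f"{GBIF_OCCURRENCE_SEARCH}?{urlencode(params)}"


url = scoped_count_url([IN_SCOPE, OUT_OF_SCOPE], [IN_SCOPE])
print(url)
```

With this approach the publisher page would report only the collection dataset's occurrences, and the human observation dataset would not contribute to any count shown on the portal.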