gbif / hosted-portals

Support material for establishing the GBIF Hosted Portals
Apache License 2.0

feedback for hosted portal dataset page #262

Closed ymgan closed 10 months ago

ymgan commented 10 months ago

Saw that feedback is welcome, so here I am:

Referring to: https://hp-antarctic.gbif-uat.org/occurrence/search?view=DATASETS

What I like

I find the organization very neat, and it is much more readable than the previous overlay. I very much appreciate this! The tabs (About, Project, Citations, Download) are similar to gbif.org, so I find them quite intuitive! The little info tag is very thoughtful, and it is enough to catch the user's attention.

What can be improved

Better wording in the info tag

Screenshot 2023-09-09 at 11 22 13

When I clicked on it, I expected to see an explanation of why not all records are in the hosted portal, but it leads to the dataset on GBIF.org. So perhaps wording like "view full dataset on GBIF.org" would make the intention more explicit. Alternatively, it could link to a page explaining why not all records from the dataset are included in the hosted portal.

Differentiate the occurrence count from hosted portal and from full dataset

Screenshot 2023-09-09 at 10 47 52 Screenshot 2023-09-09 at 10 48 01

I was a little confused by the occurrence count. I think it would be helpful to make it more obvious that the count in the drawer is from the full dataset. Or make everything consistent with the occurrence count from the hosted portal and link the occurrences to the scoped version in the hosted portal (e.g. https://hp-antarctic.gbif-uat.org/occurrence/search?datasetKey=7d4ed8b7-f31f-4133-a848-6a315ecfe7cc&view=TABLE)?
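As an illustration of the scoped link suggested above, here is a minimal sketch of how such a URL could be assembled. The helper name `occurrenceSearchUrl` and its types are hypothetical; only the path and the `datasetKey`, `taxonKey`, and `view` parameters are taken from the URLs quoted in this thread.

```typescript
// Hypothetical helper, not part of gbif-web. Only the parameter names
// (datasetKey, taxonKey, view) come from the URLs quoted in this thread.
type OccurrenceView = "TABLE" | "DATASETS";

interface OccurrenceQuery {
  datasetKey?: string;
  taxonKey?: number;
  view: OccurrenceView;
}

function occurrenceSearchUrl(portalBase: string, query: OccurrenceQuery): string {
  const params: string[] = [];
  if (query.datasetKey !== undefined) {
    params.push(`datasetKey=${encodeURIComponent(query.datasetKey)}`);
  }
  if (query.taxonKey !== undefined) {
    params.push(`taxonKey=${encodeURIComponent(String(query.taxonKey))}`);
  }
  params.push(`view=${encodeURIComponent(query.view)}`);
  return `${portalBase}/occurrence/search?${params.join("&")}`;
}

// Link to the occurrences of one dataset, scoped to the hosted portal.
const scopedLink = occurrenceSearchUrl("https://hp-antarctic.gbif-uat.org", {
  datasetKey: "7d4ed8b7-f31f-4133-a848-6a315ecfe7cc",
  view: "TABLE",
});
```

Because the link points at the portal rather than GBIF.org, the count the user sees after clicking would be the portal-scoped one.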

awkward bounding box for Geographic scope

Screenshot 2023-09-09 at 12 11 36

The bounding box of this dataset (SCAR Biogeographic Atlas of the Southern Ocean - Porifera - Data) is a little awkward, but it looks fine on the IPT.

Screenshot 2023-09-09 at 12 11 53

Meaning of colours in charts

Screenshot 2023-09-09 at 10 37 30

When looking at this chart, I am not sure whether the colours have any meaning. If they do, perhaps a legend would be helpful!

Differentiate DOI to GBIF and preferred DOI

Screenshot 2023-09-09 at 11 08 31

I think it would be helpful to differentiate the preferred identifier from the DOI that leads to GBIF, because that would help users decide whether to go to GBIF for the full dataset (since occurrences in hosted portals are scoped) or to see the dataset homepage hosted elsewhere.

Data richness and Issues and flags present related information

Screenshot 2023-09-09 at 11 25 37

I am wondering whether it would be helpful to position these 2 closer to each other, because I feel the information they present is related. Also, I had a question when I saw Identified to species: is the count based on interpreted data (the taxon match to the GBIF Backbone Taxonomy) or on provided (verbatim) data? Nonetheless, I think the info from both Data richness and Issues and flags is useful!

I hope this is helpful! Thank you so much for your hard work Morten!!

MortenHofft commented 10 months ago

Thank you @ymgan

info box text: Do you have a suggestion for what the text should be instead?

Differentiate the occurrence count from hosted portal and from full dataset: That is what the info box was intended to do.

The dataset tab shows a list of datasets that contribute data to a given query, with counts for that query. It does not show how many records there are on the hosted portal, though that is the number if you have no filters added. Example: https://hp-antarctic.gbif-uat.org/occurrence/search?taxonKey=1005624&view=DATASETS When you click a dataset, it shows information about the dataset itself. So Biological observations from the Discovery Investigations 1925-1952 will show a larger number than 11; that is just the number of occurrences for Klugeflustra antarctica (26K are included on the site and the dataset total is 38K).

Then there is the question of whether the dataset page should be a subset view of the published dataset that matches the scope of the website. I think that is a slippery slope that will never lead to a good solution, for many reasons: The geographic scope will show data in areas that might not be included on your site. The same goes for the description, taxonomic scope, temporal scope, etc. Citations will be wrong too, because they will be citations of records in the dataset that aren't part of the hosted portal site. Downloading the archive will give other counts than what the site states. And lastly, I'm not sure the publisher of the dataset will want 50 different representations of their dataset with different data counts, etc.

I think the only feasible solution in the long run is to inform the user about what they are looking at. That is the text you are writing :)

awkward bounding box for Geographic scope: Indeed it looks odd. I will replace it with something else at some point. Currently we just use the MapBox static image API, but that has issues as you see. It is just so wonderfully simple to use. A plain html image.
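To show why a static image is so simple to use, here is a rough sketch of building an image URL from a bounding box. It follows the URL pattern of the Mapbox Static Images API as I understand it; the style id (`mapbox/light-v11`), the image size, the token, and the helper names are placeholders, and this is not the code the portal actually uses.

```typescript
// Sketch only: staticMapUrl and imgTag are hypothetical helpers.
// The style id, image size, and access token are placeholder assumptions.
interface BoundingBox {
  minLon: number;
  minLat: number;
  maxLon: number;
  maxLat: number;
}

function staticMapUrl(bbox: BoundingBox, accessToken: string, width = 400, height = 300): string {
  // The bounding box goes in the path as [minLon,minLat,maxLon,maxLat].
  const box = `[${bbox.minLon},${bbox.minLat},${bbox.maxLon},${bbox.maxLat}]`;
  return (
    `https://api.mapbox.com/styles/v1/mapbox/light-v11/static/${box}/` +
    `${width}x${height}?access_token=${encodeURIComponent(accessToken)}`
  );
}

// The result can be dropped straight into a plain image element.
function imgTag(url: string): string {
  return `<img src="${url}" alt="Geographic scope" />`;
}

const exampleUrl = staticMapUrl(
  { minLon: -180, minLat: -80, maxLon: 180, maxLat: -40 }, // made-up extent
  "YOUR_TOKEN"
);
const exampleTag = imgTag(exampleUrl);
```

No map library or client-side rendering is involved, which is the appeal; the trade-off is that the image service decides how the extent is framed.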

Meaning of colours in charts: You are not the only one to comment on that. I should add tooltips, or remove it. I can tell you what it is, and then you can help decide what to do. This dataset has 38,000 records. The most frequent country has 12,000 records, so roughly 30%. That is the dark hue. With that scaling, page 2 becomes a very boring chart, as Angola: 180 wouldn't even show.

Alternatively, I can choose to scale the results on a given page so that the top value is a full bar, and everyone else is relative to that. But then each new page starts over with what looks like 100%. That is the light hue. We can now see that Angola is 6 times as frequent as Gambia.

What should the tool tips say? Or should we remove one of the colors?

page 1

Screenshot 2023-09-09 at 22 20 23

page 2

Screenshot 2023-09-09 at 22 20 31
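The two bar scalings described above can be sketched as follows. This is only an illustration with hypothetical function names, not the chart code itself; the Gambia count of 30 is an assumption derived from "Angola is 6 times as frequent as Gambia", and Angola is assumed to be the top entry on page 2.

```typescript
interface CountryCount {
  country: string;
  count: number;
}

// Dark hue: bar width as a share of all records in the dataset.
function shareOfTotal(count: number, datasetTotal: number): number {
  return datasetTotal > 0 ? count / datasetTotal : 0;
}

// Light hue: bar width relative to the largest value on the current page,
// so the top row of every page is a full bar.
function shareOfPageMax(page: CountryCount[]): number[] {
  const pageMax = Math.max(0, ...page.map((row) => row.count));
  return page.map((row) => (pageMax > 0 ? row.count / pageMax : 0));
}

const datasetTotal = 38000;
const page2: CountryCount[] = [
  { country: "Angola", count: 180 },
  { country: "Gambia", count: 30 }, // assumed value, see the note above
];

// Share of the whole dataset: under half a percent, so the bar is barely visible.
const darkAngola = shareOfTotal(page2[0].count, datasetTotal);

// Page-relative widths: Angola fills the bar, Gambia gets one sixth of it.
const lightWidths = shareOfPageMax(page2);
```

The first scaling keeps bars comparable across pages but flattens everything after page 1; the second keeps each page readable but resets the visual scale on every page.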

DOI: I'm not sure I understand actually. Could you expand please? Would you like a more prominent link to the GBIF.org website version? If so I tend to agree.

Chart ordering: Why do you feel like these 2 are more related than say data richness and event date? The reason I added it last is because I wanted it to be easy to find, but not starting out with highlighting issues first. It felt impolite. I want the dataset publisher to be happy with how we present their work.

Identified to species: It is based on interpreted data. I do not believe I can tell anything from the verbatim scientific name without interpreting it. We can add a tooltip or such? I think it might also have a bug btw: https://github.com/gbif/gbif-web/issues/339

ymgan commented 10 months ago

info box text

Do you have a suggestion for what the text should be instead?

What do you think about having a general information part (not customizable) and a custom information part that can be customized by each hosted portal? For example

(start of general part) Not all records from the dataset are included on this site. Visit GBIF.org for more information on hosted portals. (end of general part)

(start of custom markdown) This portal's scope is detailed in the FAQ. (end of custom markdown)

something like that?
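To make the proposal concrete, here is a minimal sketch of how a fixed general part and an optional per-portal part could be combined. The function `infoBoxText` and the constant name are hypothetical and do not come from gbif-web or any site config.

```typescript
// Hypothetical sketch: neither the function nor the constant exists in gbif-web.
const GENERAL_NOTICE =
  "Not all records from the dataset are included on this site. " +
  "Visit GBIF.org for more information on hosted portals.";

// Returns the general notice, followed by the portal's own markdown if any.
function infoBoxText(customMarkdown?: string): string {
  const custom = customMarkdown?.trim();
  return custom ? `${GENERAL_NOTICE}\n\n${custom}` : GENERAL_NOTICE;
}

// A portal without a custom message shows only the general part.
const defaultText = infoBoxText();

// A portal with a custom message gets both parts, separated by a blank line.
const antarcticText = infoBoxText("This portal's scope is detailed in the FAQ.");
```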

Differentiate the occurrence count from hosted portal and from full dataset

I think the only feasible solution in the long run is to inform the user about what they are looking at. That is the text you are writing :)

You are absolutely right about this!

awkward bounding box for Geographic scope

Indeed it looks odd. I will replace it with something else at some point. Currently we just use the MapBox static image API, but that has issues as you see. It is just so wonderfully simple to use. A plain html image.

I totally understand that! Thank you so much for looking into it!

Meaning of colours in charts

Ah, I understand now~ Thank you very much for your hard work on this! In my humble opinion, the pie chart shows the percentage more visually. I feel it is already hard to point at South Africa, which ranked 5th in the dataset, so a tooltip would be difficult to use. If it were me, I would prefer to keep only one colour and let the pie chart handle the percentage. (I still appreciate your attempt though, thank you!)

DOI

Would you like a more prominent link to the GBIF.org website version? If so I tend to agree.

Yes exactly! Of course you did not understand, because I made a mistake in the previous screenshot 🙈 (sorry about that!) Both DOIs of the dataset in the red rectangles point to the dataset home page. It is not obvious to me that the GBIF logo at the bottom means "view dataset on GBIF". So yes please for a more prominent link to the GBIF.org website version!

Chart ordering

Why do you feel like these 2 are more related than say data richness and event date?

I guess I was thinking that if continent and country can be derived from coordinates, it means those records have coordinates. But I clearly was not thinking straight, because datasets with continent and country filled in will not have those flags. So please ignore what I said in the previous comment.

The reason I added it last is because I wanted it to be easy to find, but not starting out with highlighting issues first. It felt impolite. I want the dataset publisher to be happy with how we present their work.

You are absolutely right on this! Thank you for being so considerate, I really appreciate that!

Identified to species

It is based on interpreted data. I do not believe I can tell anything from the verbatim scientific name without interpreting it. We can add a tooltip or such? I think it might also have a bug btw: https://github.com/gbif/gbif-web/issues/339

Right! Got it! I was thinking differently! I thought it was about the count based on the value of taxonRank == "species" 🙈 Thank you for looking into the possible bug!

MortenHofft commented 10 months ago

Thank you for your detailed comments @ymgan

Issue for chart bars: https://github.com/gbif/gbif-web/issues/345

Link to gbif.org: I've added an explicit link in the Table of contents

Info box: I've updated the text, and you can overwrite it by adding a custom message in your site config. I've done so here: https://github.com/gbif/hp-antarctic/commit/097e3737a7e0903769ce58b4423c88bfd2d1e2e8 Example: https://hp-antarctic.gbif-staging.org/occurrence/search?datasetKey=96385e98-93b5-4e4a-8081-6e30b738816a&view=DATASETS

Identified to species: you are completely right. That is essentially what it does. It could just as well have been as you say, except I do not have access to the publisher-provided info, I believe. We can add an explanation; I just need to find a way to do so that doesn't create too much clutter. Having explanations all over the place can be a bit overwhelming, and kind of does the opposite of being helpful. The thing is that all of the information is based on interpretation. Same for year and coordinates. If we couldn't process it, then it wouldn't be included in the counts. So really it is a comment on all fields.

DOI: the DOI will not always point to GBIF.org. It depends on how the data was published.

MortenHofft commented 10 months ago

Feel free to open a new issue if needed. I think I have addressed all of the above and deployed it to staging. The geographic scope is still pending a better solution than Mapbox.