NCEAS / metacatui

MetacatUI: A client-side web interface for DataONE data repositories
https://nceas.github.io/metacatui
Apache License 2.0

Consider fallback behavior in JSON-LD routines for missing fields #1803

Closed: amoeba closed this issue 3 years ago

amoeba commented 3 years ago

Google's crawler will occasionally report issues with our schema.org/Dataset JSON-LD, such as missing names or missing abstracts. Sometimes these are legitimate and sometimes they're bogus. One of the more common issues it detects is a missing description field. We populate this from the abstract field in Solr. I had thought that the vast majority of datasets would have abstracts, but it appears this is not the case and that we have hundreds of thousands without.

This is widespread, and @mbjones mentioned that Google might stop indexing or ingesting content once it runs into errors like this. That would imply that errors like missing abstracts from one member node might keep other member nodes from being indexed, which is pretty problematic.

I think abstract is the biggest problem we might fix today. When the Solr index document is missing that field for a dataset, we could:

  1. Status quo: continue not generating a matching description. Probably not a good idea.
  2. Repeat the title, as it's the best other piece of information we reliably have.
  3. Generate some helpful text encouraging the user to go to the DataONE landing page.

I'm a fan of (3) and think we could put text in like:

No description is available. See https://dataone.org/datasets/doi%3A10.18739%2FA25M62781 for complete metadata about this dataset.

Rendered on Dataset Search, this might look like:

[Screenshot: Screen Shot 2021-05-21 at 11 17 28 AM]

amoeba commented 3 years ago

Done in de297327daf1793677176f2ec83b2e6fb0c23c60.

When the abstract field is missing, you get a description like:

"description": "No description is available. Visit https://dataone.org/datasets/urn%3Auuid%3A1161a3af-27f0-49ce-be93-515016cb6b75 for complete metadata about this dataset.",
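For anyone reading along, the fallback can be sketched roughly as below. This is a minimal illustration and not the code from the commit: `buildDescription`, `LANDING_PAGE_BASE`, and the `id` field on the Solr document are assumed names chosen for the example; only the `abstract` field and the message text come from this thread.

```javascript
// Hypothetical sketch of the fallback; names here are illustrative,
// not MetacatUI's actual functions or configuration.
const LANDING_PAGE_BASE = "https://dataone.org/datasets/";

function buildDescription(solrDoc) {
  // Use the abstract when the Solr document has a non-empty one.
  const abstract = solrDoc.abstract;
  if (typeof abstract === "string" && abstract.trim().length > 0) {
    return abstract.trim();
  }

  // Otherwise point the reader at the landing page. The identifier is
  // percent-encoded so ":" and "/" are safe inside the URL path.
  // Assumes the identifier is available as `solrDoc.id`.
  const url = LANDING_PAGE_BASE + encodeURIComponent(solrDoc.id);
  return `No description is available. Visit ${url} for complete metadata about this dataset.`;
}
```

Treating a whitespace-only abstract the same as a missing one is a choice made for this sketch; the thread only covers the case where the field is absent.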

I loosely confirmed, by looking at other datasets on Dataset Search, that they autolink URLs in text, so this should result in a hyperlink when viewed on Dataset Search.

@laurenwalker I put this on 2.15.2 since it's ready to go and is a minor tweak.

laurenwalker commented 3 years ago

Thanks Bryce!