clarin-eric / VirtualCollectionRegistry

Virtual Collection Registry (VCR)
GNU General Public License v3.0

add embedded metadata to HTML to improve reference manager support #150

Open dietervu opened 3 years ago

dietervu commented 3 years ago

E.g. Zotero now recognizes the VCs with DOIs but not yet the ones with only a handle. Include some simple metadata headers (see VLO for an example) so that reference managers can automatically detect the author, title, and URL.

WillemElbers commented 3 years ago

After discussion with Twan, can you confirm this (https://github.com/clarin-eric/VLO/issues/209) is the VLO functionality you are referring to?

dietervu commented 3 years ago

yes!

WillemElbers commented 3 years ago

References:

First proposal for embedded schema.org metadata on collection detail pages:

<script type="application/ld+json">
/*<![CDATA[*/
{
  "url": "http://localhost:8080/service/virtualcollections/1079",
  "name": "test query",
  "description": "test collection; test collection; test collection; test collection",
  "identifier": [
    "https://doi.org/10.17907/test1079",
    "dummy:identifier-1079"
  ],
  "includedInDataCatalog": {
    "url": "http://localhost:8080",
    "@type": "DataCatalog"
  },
  "creator": [
    {
      "name": "admin1, ",
      "@type": "Person"
    }
  ],
  "hasPart": [
    {
      "url": "http://www.clarin.eu/",
      "name": "CLARIN website",
      "description": "CLARIN main website",
      "@type": "CreativeWork"
    },
    {
      "url": "http://www.google.nl/",
      "name": "Google search engine",
      "description": "Google search engine main page",
      "@type": "CreativeWork"
    },
    {
      "url": "http://www.bing.nl/",
      "name": "Bing search engine",
      "description": "Microsoft bing search engine.",
      "@type": "CreativeWork"
    }
  ],
  "@context": "https://schema.org",
  "@type": "Dataset"
}/*]]>*/
</script>
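
A minimal sketch of how a block like the one above could be generated, assuming the collection is available as a plain dictionary. The function names `build_jsonld` and `render_script_tag` and the input field names (`identifiers`, `catalog_url`, `creators`, `resources`) are hypothetical and not taken from the VCR code base; the sample values come from the proposal. The sketch spells the type `Dataset`, which is how schema.org defines it.

```python
import json


def build_jsonld(collection: dict) -> dict:
    """Map a (hypothetical) collection record onto schema.org Dataset properties."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "url": collection["url"],
        "name": collection["name"],
        "description": collection["description"],
        "identifier": list(collection["identifiers"]),
        "includedInDataCatalog": {
            "@type": "DataCatalog",
            "url": collection["catalog_url"],
        },
        "creator": [
            {"@type": "Person", "name": name} for name in collection["creators"]
        ],
        "hasPart": [
            {
                "@type": "CreativeWork",
                "url": resource["url"],
                "name": resource["name"],
                "description": resource["description"],
            }
            for resource in collection["resources"]
        ],
    }


def render_script_tag(data: dict) -> str:
    """Serialize the metadata and wrap it in a JSON-LD script element."""
    # Escape "</" so a value containing "</script>" cannot close the element early;
    # "\/" is a valid JSON escape, so the payload still parses to the same value.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Sample record using values from the proposal (one resource shown for brevity).
example = {
    "url": "http://localhost:8080/service/virtualcollections/1079",
    "name": "test query",
    "description": "test collection",
    "identifiers": ["https://doi.org/10.17907/test1079", "dummy:identifier-1079"],
    "catalog_url": "http://localhost:8080",
    "creators": ["admin1"],
    "resources": [
        {
            "url": "http://www.clarin.eu/",
            "name": "CLARIN website",
            "description": "CLARIN main website",
        }
    ],
}

print(render_script_tag(build_jsonld(example)))
```

Serializing with a JSON library rather than string concatenation keeps quoting and escaping correct for arbitrary titles and descriptions.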

Question(s):

dietervu commented 3 years ago

It would be good to include a publication date (at the very least the year), so that citation managers can create (creator, year) strings.
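
To illustrate why the year matters, here is a rough sketch of how a consumer might derive a (creator, year) label from the embedded metadata. It assumes the date is exposed as a schema.org `datePublished` value in ISO 8601 form; the function name `citation_label` and the fallback strings are hypothetical, and real citation managers may behave differently.

```python
from datetime import datetime


def citation_label(metadata: dict) -> str:
    """Build a "(creator, year)" label from schema.org-style metadata."""
    creators = metadata.get("creator", [])
    # Strip stray whitespace and commas around the name (assumed to be noise).
    name = creators[0]["name"].strip(" ,") if creators else "Anonymous"
    published = metadata.get("datePublished")
    # Without a publication date only a "no date" placeholder is possible.
    year = str(datetime.fromisoformat(published).year) if published else "n.d."
    return f"({name}, {year})"


with_date = {
    "creator": [{"@type": "Person", "name": "admin1"}],
    "datePublished": "2021-06-26T12:39:14.000+02:00",
}
without_date = {"creator": [{"@type": "Person", "name": "admin1"}]}

print(citation_label(with_date))
print(citation_label(without_date))
```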

WillemElbers commented 3 years ago

Updated with dateCreated, dateModified and datePublished:


<script type="application/ld+json">
/*<![CDATA[*/
{
  "url": "http://localhost:8080/service/virtualcollections/1079",
  "name": "test query",
  "description": "test collection; test collection; test collection; test collection",
  "identifier": [
    "https://doi.org/10.17907/test1079",
    "dummy:identifier-1079"
  ],
  "includedInDataCatalog": {
    "url": "http://localhost:8080",
    "@type": "DataCatalog"
  },
  "creator": [
    {
      "name": "admin1, ",
      "@type": "Person"
    }
  ],
  "hasPart": [
    {
      "url": "http://www.clarin.eu/",
      "name": "CLARIN website",
      "description": "CLARIN main website",
      "@type": "CreativeWork"
    },
    {
      "url": "http://www.google.nl/",
      "name": "Google search engine",
      "description": "Google search engine main page",
      "@type": "CreativeWork"
    },
    {
      "url": "http://www.bing.nl/",
      "name": "Bing search engine",
      "description": "Microsoft bing search engine.",
      "@type": "CreativeWork"
    }
  ],
  "dateCreated": "2021-06-18T02:00:00.000+02:00",
  "dateModified": "2021-06-28T12:39:14.000+02:00",
  "datePublished": "2021-06-26T12:39:14.000+02:00",
  "@context": "https://schema.org",
  "@type": "Dataset"
}/*]]>*/
</script>

twagoo commented 3 years ago

Not sure if this is important to consider, but Google doesn't seem to know/accept {"@type": "CreativeWork"} in a hasPart property for ingesting data sets - it can only be one of a few specific subtypes. Unfortunately this information is not always available. Not sure if we can solve this in a proper and/or easy way but it's probably good to keep it in mind.

Edit: see spec

twagoo commented 3 years ago

Actually, I think hasPart may technically not be the right property to use except for resources in the collection that can in fact somehow be considered data sets - in which case they should be typed as such. I guess we cannot know to which resources this applies unless we explicitly ask the user. But there are no properties available that are a better fit as far as I can tell.
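
One way this idea could be sketched, purely as an illustration: emit `hasPart` only for resources the user has explicitly marked as data sets, and type those as `Dataset`. The `is_dataset` flag, the function names, and the example.org URL are hypothetical; nothing like this flag exists in the proposal above.

```python
def dataset_parts(resources: list) -> list:
    """Return hasPart entries only for resources the user marked as data sets."""
    return [
        {"@type": "Dataset", "url": resource["url"], "name": resource["name"]}
        for resource in resources
        # Assumption: the user sets "is_dataset" explicitly; default is False.
        if resource.get("is_dataset", False)
    ]


def with_parts(jsonld: dict, resources: list) -> dict:
    """Copy the metadata, adding hasPart only when at least one part qualifies."""
    result = {key: value for key, value in jsonld.items() if key != "hasPart"}
    parts = dataset_parts(resources)
    if parts:
        result["hasPart"] = parts
    return result


resources = [
    {"url": "http://www.clarin.eu/", "name": "CLARIN website"},
    {"url": "https://example.org/corpus", "name": "Example corpus", "is_dataset": True},
]
base = {"@context": "https://schema.org", "@type": "Dataset", "name": "test query"}

print(with_parts(base, resources))
```

Resources that are not flagged are simply left out of `hasPart` here; whether they should be exposed through some other property remains the open question raised above.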