christian-monch commented 3 years ago

The goal is to provide schema.org-compatible Dataset definitions in studyforrest.org.

We assume that all studyforrest datasets are in BIDS format. BIDS metadata alone is not sufficient to generate a useful schema.org Dataset record, so the missing information has to be added in another way.

If we had time/in a perfect world:

The proposed strategy is to use the existing studyminimeta format to provide the missing information. Studyminimeta comes with a datalad-metalad indexer that converts studyminimeta documents into schema.org Dataset descriptions. To avoid entering the same data twice, we create a tool that generates studyminimeta documents from BIDS datasets. The generated documents are then amended with the missing information, and the indexer is used to generate the schema.org Dataset markup. The steps are:

1. Provide a BIDS2studyminimeta extractor.
2. Execute the extractor on all studyforrest datasets.
3. Update missing information in all studyminimeta.yaml files.
4. Use the studyminimeta indexer to generate schema.org Dataset markup.
5. Combine the generated markup into a DataCatalog.
6. Add the markup to studyforrest.org.
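
The step that combines the generated markup into a DataCatalog could look roughly like this. It is only a sketch: the function name `build_catalog` and the example IDs are made up for illustration, and the input is assumed to be a list of already generated schema.org Dataset records, each with an `@id`.

```python
import json


def build_catalog(catalog_id, name, datasets):
    """Wrap schema.org Dataset records in one JSON-LD document with a DataCatalog."""
    catalog = {
        "@id": catalog_id,
        "@type": "DataCatalog",
        "name": name,
        # Reference each dataset by its @id instead of nesting the full record.
        "dataset": [{"@id": d["@id"]} for d in datasets],
    }
    return {
        "@context": {"@vocab": "http://schema.org/"},
        # The catalog comes first, followed by the dataset records themselves.
        "@graph": [catalog, *datasets],
    }


# Hypothetical example records, not real studyforrest metadata.
example_datasets = [
    {"@id": "https://example.org/ds-a", "@type": "Dataset", "name": "Example dataset A"},
    {"@id": "https://example.org/ds-b", "@type": "Dataset", "name": "Example dataset B"},
]
doc = build_catalog("https://example.org/catalog", "Example catalog", example_datasets)
print(json.dumps(doc, indent=2))
```

Referencing datasets by `@id` inside the catalog keeps each record in the `@graph` only once.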

Since we are in a hurry:

[x] Use the existing parsers to extract data from dataset_description.json-files (including the existing citation-parser).
[x] Edit the result manually
[x] Put it on studyforrest.org
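
For illustration, extracting a first Dataset record from a `dataset_description.json` might be sketched as below. This is not the existing parser code mentioned in the checklist. The function name `bids_to_schema_org`, the chosen field mapping, and the example values are assumptions, and the citation parsing is left out entirely.

```python
import json


def bids_to_schema_org(description, dataset_id):
    """Map a few BIDS dataset_description.json fields to a schema.org Dataset dict."""
    record = {
        "@id": dataset_id,
        "@type": "Dataset",
        "name": description["Name"],  # "Name" is required by BIDS
    }
    if "License" in description:
        record["license"] = description["License"]
    if "Authors" in description:
        # BIDS stores authors as plain strings; wrap each one in a Person.
        record["author"] = [
            {"@type": "Person", "name": author} for author in description["Authors"]
        ]
    if "ReferencesAndLinks" in description:
        record["citation"] = list(description["ReferencesAndLinks"])
    return record


# Hypothetical input; in practice this would be json.load() of the file.
example_description = {
    "Name": "Example dataset",
    "BIDSVersion": "1.0.0",
    "License": "PDDL",
    "Authors": ["A. Author", "B. Author"],
}
record = bids_to_schema_org(example_description, "https://example.org/ds")
print(json.dumps(record, indent=2))
```

Whatever the parsers cannot provide would then be added to the resulting record by hand, as in the second checklist item.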

christian-monch commented 3 years ago

First version of the metadata description: studyforrest-ld.json.txt (schema.org in JSON-LD).

It should be added to the studyforrest.org page inside the head element, in a script tag with type "application/ld+json". Example:


```html
<html>
  <head>
    <script type="application/ld+json">
      {
        "@context": {
          "@vocab": "http://schema.org/"
        },
        "@graph": [
          {
            "@id": "https://schema.studyforrest.org/studyforrest-data",
            "@type": "DataCatalog",
            "name": "Studyforrest Datasets",
            "accountablePerson": "m.hanke@fz-juelich.de",

            ...

    </script>
    ...
  </head>
  <body>
  </body>
</html>
```

christian-monch commented 3 years ago

@aqw I can add the metadata file, but maybe you would like to do that yourself, so that we don't step on each other's toes. ;-)

aqw commented 3 years ago

Happy to add this. Should this be on just the main page, or part of the header of all pages on the site?

aqw commented 3 years ago

I was curious if we could link to a file from the header rather than embed it into every page. It seems that is sadly not possible.

It seems that (as of now) JSON-LD is only searched for in <script> elements. However, according to the W3C:

The script element allows authors to include dynamic script and data blocks in their documents. [...]

When used to include dynamic scripts, the scripts may either be embedded inline or may be imported from an external file using the src attribute. [...]

When used to include data blocks (as opposed to scripts), the data must be embedded inline [...]

<link> would be the appropriate tag for non-embedded, static data blocks, but JSON-LD doesn't look for <link>.

So embedding it in every single page is the solution for now. Seems like a waste, but it is what it is.
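
If the pages are generated or post-processed by a script, the embedding could be automated along these lines. This is a rough sketch, not how studyforrest-www actually does it: the helper name `embed_jsonld` is hypothetical, and it assumes each page contains a literal `</head>` tag.

```python
import json
import re


def embed_jsonld(html, jsonld):
    """Insert a JSON-LD data block just before the closing </head> tag."""
    # Escape "</" so the payload cannot terminate the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(jsonld, indent=2).replace("</", "<\\/")
    script = f'<script type="application/ld+json">\n{payload}\n</script>\n'
    match = re.search(r"</head\s*>", html, re.IGNORECASE)
    if match is None:
        raise ValueError("page has no </head> tag")
    return html[: match.start()] + script + html[match.start():]


# Hypothetical page and metadata for demonstration.
page = "<html><head><title>Example</title></head><body></body></html>"
metadata = {"@context": {"@vocab": "http://schema.org/"}, "@type": "DataCatalog"}
print(embed_jsonld(page, metadata))
```

Running this over every page keeps the metadata in a single source file while still satisfying the inline requirement.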

christian-monch commented 3 years ago

A PR for studyforrest-www has been created: https://github.com/psychoinformatics-de/studyforrest-www/pull/25

psychoinformatics-de / studyforrest-data

Provide schema.org Dataset-markup for studyforrest datasets #45
