ResearchObject / ro-crate

Research Object Crate
https://w3id.org/ro/crate/
Apache License 2.0

Use Case: integration/contrasting with GEMD format by Citrine team #190

Open sgbaird opened 2 years ago

sgbaird commented 2 years ago

As a materials informatics PhD student, I want a way of storing experimental data so that it is easily integrated into machine learning workflows.

See the GEMD docs (I am unaffiliated). I'm also curious how RO-Crate's functionality and scope compare and contrast with the GEMD format, for example regarding the idea of intent vs. realization of research data.

Does there seem to be potential for a converter between the two? Or are the two scopes different enough that a converter seems impractical or infeasible?

ptsefton commented 2 years ago

You might be interested in this work I did on RO-Crate export of LabArchives notebooks.

https://github.com/UTS-eResearch/labarchives-to-ro-crate

This was about exporting an entire notebook and all its data into a crate. It did some tricks to include the content of pages in the crate so you could more or less read the notebook.

There is an example online here: https://data.research.uts.edu.au/examples/ro-crate/examples/src/notebooks/dataset_for_open_source_malaria_example/data/ro-crate-preview.html#

Here's an example of a notebook page: https://data.research.uts.edu.au/examples/ro-crate/examples/src/notebooks/dataset_for_open_source_malaria_example/data/ro-crate-preview.html#osm/misc_purifications/aew_300_1

Each part of each notebook page is an Article:

{
  "@id": "#Mzc1OS42fDE1MDkvMjg5Mi9FbnRyeVBhcnQvODc2NjU2ODAwfDk1NDMuNg==",
  "@type": [
    "Article"
  ],
  "articleBody": "<p>⬇️🏷️ Download: <a href='osm/the_p_ochf2_core/egt_23_2/Mzc1OS42fDE1MDkvMjg5Mi9FbnRyeVBhcnQvODc2NjU2ODAwfDk1NDMuNg==/scheme_EGT_23-1.cdx'>scheme_EGT_23-1.cdx</a></p>\n",
  "dateCreated": "2017-04-13T14:42:49Z",
  "contributor": "Edwin Tse",
  "version": "1",
  "description": "file uploaded by Edwin Tse at 2017-04-13T14:42:49Z"
}

And the pages themselves are all of type ["Dataset", "Article"].
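
For anyone wanting to pull this structure into a data pipeline, here is a rough Python sketch of how pages and page parts could be told apart by their `@type`. It assumes the metadata file is flattened JSON-LD with a top-level `@graph` array, as in published RO-Crate versions; I have not checked it against the older crate linked in this thread. The `#page-1`, `#part-1` and `#person-1` identifiers, the names, and the helper functions `as_list` and `entities_of_type` are made up for illustration.

```python
import json

# Hand-written stand-in for a crate's metadata file. The "@id" values and
# "name" values are hypothetical; only the "@type" pattern comes from the thread.
METADATA = """
{
  "@graph": [
    {"@id": "#page-1", "@type": ["Dataset", "Article"], "name": "Example page"},
    {"@id": "#part-1", "@type": ["Article"], "contributor": "Edwin Tse",
     "dateCreated": "2017-04-13T14:42:49Z"},
    {"@id": "#person-1", "@type": "Person", "name": "Someone"}
  ]
}
"""


def as_list(value):
    """JSON-LD allows "@type" to be a string or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def entities_of_type(crate, *required_types):
    """Return entities in "@graph" whose "@type" includes every required type."""
    wanted = set(required_types)
    return [
        entity
        for entity in crate.get("@graph", [])
        if wanted <= set(as_list(entity.get("@type")))
    ]


crate = json.loads(METADATA)

# Pages carry both types; page parts are Articles that are not also Datasets.
pages = entities_of_type(crate, "Dataset", "Article")
parts = [e for e in entities_of_type(crate, "Article") if e not in pages]

for part in parts:
    print(part["@id"], part.get("contributor"), part.get("dateCreated"))
```

With a real crate you would replace `METADATA` with the contents of the metadata file, and the same filtering should apply as long as the `@graph` assumption holds.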

(The complete RO-Crate metadata is here - it's an older version of RO-Crate: https://data.research.uts.edu.au/examples/ro-crate/examples/src/notebooks/dataset_for_open_source_malaria_example/data/ro-crate-metadata.jsonld)

I no longer work for UTS, where we did this work, but I was hoping to create an RO-Crate profile that would serve as an interchange format for lab notebook systems.