jupyterlab / jupyterlab-metadata-service

Linked data exploration in JupyterLab.
BSD 3-Clause "New" or "Revised" License

Discussions with schema.org folks #7

Closed: ellisonbg closed this issue 5 years ago

ellisonbg commented 5 years ago

@fperez and I had a talk with R. V. Guha, who works on schema.org and Data Commons. It looks like there is interest in getting a group together to work through adding capabilities to the schema.org schema for datasets, notebooks, source code files, etc. Will post more as details emerge.

bollwyvl commented 5 years ago

Very exciting stuff! Yes please do keep us updated here! I'd be happy to participate in whatever discussions move forward, having been down this road a few times!

First, I'd like to point out that, as of recent-ish developments in the standards, the notebook format v4 could now be considered "Linked Data Ready": the last roadblocks, our mixed semi-private, semi-junk-drawer metadata fields, can now be described for linked data. This means we can declare by fiat that a schema-conforming v4 document has a robust meaning in the schema.org sense, while v5 would require very little modification to support extensible linked data features like string-level internationalization.

It seems like a ComputableDigitalDocument (as a kind of CreativeWork, DigitalDocument) is long overdue to join SpreadsheetDigitalDocument (potentially as an ancestor), with whatever the JSON encoding of a notebook would actually be as NotebookDigitalDocument. Even getting the CreativeWork metadata would already be really handy.
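As a rough illustration of the "even just the CreativeWork metadata" case, here is a minimal sketch that maps a couple of notebook metadata fields onto existing schema.org properties as JSON-LD. It uses the existing DigitalDocument type, since ComputableDigitalDocument and NotebookDigitalDocument are only proposals in this comment and not schema.org types. The function name `notebook_to_jsonld`, the choice of fields, the `application/x-ipynb+json` media type, and the assumption that each author entry carries a `name` key are all illustrative, not part of any agreed mapping.

```python
import json


def notebook_to_jsonld(nb: dict, url: str) -> dict:
    """Hypothetical helper: describe a notebook as a schema.org DigitalDocument.

    Only the optional top-level `title` and `authors` metadata fields are read;
    everything else in the notebook is ignored.
    """
    meta = nb.get("metadata", {})
    doc = {
        "@context": "https://schema.org",
        # Existing schema.org type; the notebook-specific subtypes discussed
        # above do not exist yet.
        "@type": "DigitalDocument",
        "url": url,
        # Assumed media type for .ipynb files.
        "encodingFormat": "application/x-ipynb+json",
    }
    if "title" in meta:
        doc["name"] = meta["title"]
    # Assumption: each author is an object with a "name" key; others are skipped.
    authors = [
        {"@type": "Person", "name": a["name"]}
        for a in meta.get("authors", [])
        if isinstance(a, dict) and "name" in a
    ]
    if authors:
        doc["author"] = authors
    return doc


if __name__ == "__main__":
    example = {"metadata": {"title": "Demo", "authors": [{"name": "Ada"}]}}
    print(json.dumps(notebook_to_jsonld(example, "https://example.org/demo.ipynb"), indent=2))
```

The resulting object could be embedded in a rendered page as a JSON-LD script block for discoverability; it says nothing yet about the environment or computability.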

Our kernelspecs are cool and everything, but Linked Data is probably the technology best suited to actually capturing The Environment Problem: we can't just shout MIMETYPES and say it's going to be alright. While this is an nbformat v5 kind of thing, it's probably good to have on the table during discussions.

Of course, schema.org still isn't really "publication grade" for folks like Nature, Science, etc., who prefer W3C standards and incumbent taxonomies like Dublin Core. But schema.org sometimes uses those, so it's a good step forward for making notebooks more discoverable, if not any more computable or citable.

Hooray!

saulshanabrook commented 5 years ago

Closing this for now, since we support multiple taxonomies.