w3c / json-ld-syntax

JSON-LD 1.1 Specification
https://w3c.github.io/json-ld-syntax/

YAML-LD? #389

Closed VladimirAlexiev closed 2 years ago

VladimirAlexiev commented 2 years ago

I thought I knew JSON-LD.

But then I saw this DOAP example at https://github.com/common-workflow-language/common-workflow-language/wiki/Related-ontologies. Compare to an actual Turtle of a Debian package: https://packages.qa.debian.org/b/bowtie.ttl

"@context":
  "foaf": "http://xmlns.com/foaf/0.1/"
  "doap": "http://usefulinc.com/ns/doap"
  "adms": "http://purl.org/adms/"
  "admssw": "http://purl.org/adms/sw/"

adms:Asset
  admssw:SoftwareProject
    doap:name: "STAR"
    doap:description: >
      Aligns RNA-seq reads to a reference genome using uncompressed suffix arrays.
      STAR has a potential for accurately aligning long (several kilobases) reads that are
      emerging from the third-generation sequencing technologies.
    doap:homepage: "https://github.com/alexdobin/STAR"
    doap:repository:
      - doap:GitRepository:
        doap:location: "https://github.com/alexdobin/STAR.git"
    doap:release:
      - doap:revision: "2.5.0a"
    doap:license: "GPL"
    doap:category: "commandline tool"
    doap:programming-language: "C++"
    foaf:Organization:
      - foaf:name: "Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA"
      - foaf:name: "2Pacific Biosciences, Menlo Park, CA, USA"
    foaf:publications:
      - foaf:title: "(Dobin et al., 2013) STAR: ultrafast universal RNA-seq aligner. Bioinformatics."
        foaf:homepage: "http://www.ncbi.nlm.nih.gov/pubmed/23104886"
    doap:developer:
      - foaf:Person:
        foaf:name: "Alexander Dobin"
        foaf:mbox: "mailto:dobin at cshl.edu"
        foaf:fundedBy: "This work was funded by NHGRI (NIH) grant U54HG004557"
  adms:AssetDistribution
    doap:name: "STAR.cwl"
    doap:specification: "http://common-workflow-language.github.io/draft-3/"
    doap:release: "cwl:draft-3.dev2"
    doap:homepage: "https://github.com/common-workflow-language/workflows/blob/master/tools/STAR.cwl"

And I'm like WHAT MAGIC is this?

@stain @mr-c can you shed some light?

Googled YAML-LD and only saw a brief discussion again by the CWL people: https://lists.w3.org/Archives/Public/public-linked-json/2015Jan/0035.html

The JSON-LD spec says

> Although not discussed in this specification, parallel work using YAML could be used to map into the internal representation, allowing the JSON-LD 1.1 API to operate as if the source was a JSON document.

Despite no official support in the spec, the gazillion JSON-LD conformance tests are written in YAML: https://github.com/w3c/json-ld-syntax/tree/main/yaml.

I know that YAML can be trivially converted to JSON and from there to JSON-LD. But as per the above discussion, it would be nice to get rid of the need to write those pesky @. @gkellogg is there some convention for that, or is the CWL example not widely adopted?

pchampin commented 2 years ago

I concur that this example from CWL is really strange, in many respects:

It looks like some part of the context was forgotten. For example, adding

      "adms:Asset": {"@container": "@type"}

turns admssw:SoftwareProject and adms:AssetDistribution into the types of the corresponding nodes, which seems to make sense. But the empty keys for types are a mystery to me.
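To make the type-map idea above concrete, here is a minimal Python sketch of what "@container": "@type" does to the value of a term: each key becomes the @type of the node stored under it. This is a simplified illustration under stated assumptions, not a real JSON-LD processor. The function name expand_type_map and the shortened sample data are hypothetical, and prefix resolution and merging with an existing @type are deliberately left out.

```python
def expand_type_map(type_map):
    """Turn {type: node} entries into nodes that carry an explicit @type.

    Simplified illustration only: a real JSON-LD processor also resolves
    prefixes, merges existing @type values, and accepts string node references.
    """
    nodes = []
    for type_name, value in type_map.items():
        # A type-map value may be a single node object or a list of them.
        members = value if isinstance(value, list) else [value]
        for member in members:
            node = {"@type": type_name}
            node.update(member)
            nodes.append(node)
    return nodes


# Hypothetical, shortened version of the data found under "adms:Asset".
asset = {
    "admssw:SoftwareProject": {"doap:name": "STAR"},
    "adms:AssetDistribution": {"doap:name": "STAR.cwl"},
}

nodes = expand_type_map(asset)
for node in nodes:
    print(node)
```

With this reading, the two entries under adms:Asset become two typed nodes, which matches the interpretation suggested above.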

mr-c commented 2 years ago

Hello all. That snippet is not part of the CWL recommended practices (the CWL standards recommend using schema.org annotations). Looks like @portah authored it, so I'll leave that to him to explain.

The CWL standards themselves are defined in the schema-salad language, created by @tetron in 2015 to be a sort of YAML-LD https://github.com/common-workflow-language/schema_salad/

anatoly-scherbakov commented 2 years ago

I have been using something I call YAML-LD for a while.

In this example, I am defining rdfs:range and rdfs:domain properties for a handful of terms from an external ontology.

$context:
  skill: https://schema.jsonldresume.org/#
  schema: https://schema.org/
  rdfs: http://www.w3.org/2000/01/rdf-schema#
  ←:
    $id: rdfs:domain
    $type: $id
  →:
    $id: rdfs:range
    $type: $id
  links:
    $id: skill:links
    $container: $id
links:
  skill:assesses:
    ←: skill:Award
    →: schema:Text
  skill:award:
    ←: skill:Resume
    →: skill:Award
  ...

Purpose: I use YAML-LD to write data files, which are then used to build a semantic graph, which in turn is used to build the pages of a static website. Example: source files directory and resulting page.

The code is essentially a plugin for the MkDocs site generator. It is open source and available on GitHub.

VladimirAlexiev commented 2 years ago

@msporny @gkellogg @pchampin @mr-c @anatoly-scherbakov @ericprud

Are there enough people who'd like to work out a standardization of YAML-LD?

YAML is nicer than JSON, yet more powerful. I think that standardization of YAML-LD can be very useful.

Also cc @andimou, @pheyvaer re YARRML.

Manu, Gregg, and P-A: what's the best way to start this?

gkellogg commented 2 years ago

There is certainly space to do it within the JSON-LD CG, and there has been some interest over time, but not enough to get anyone to work on it seriously. If you'd like to do so, then the best place to file issues and make progress would be the json-ld.org repository.

Ultimately, these would need to be in their own repository (perhaps https://github.com/json-ld/yaml-ld) which can use the various infrastructure for rendering specifications and hosting test suites.

Note that the premise most of us have been working from is the basic YAML serialization of JSON-LD, and as you've likely noted, the -syntax and -api repos have all of the examples in the spec automatically transformed into hypothetical yaml-ld. But there are a number of YAML-specific details that likely need to be attended to.
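One example of such a YAML-specific detail, offered here as an illustration rather than as something named in the comment: YAML mappings may use non-string keys, and YAML 1.1 parsers such as PyYAML resolve some plain scalars to dates, neither of which has a direct JSON equivalent. The sketch below assumes the YAML has already been loaded into plain Python objects; the function name find_non_json_values and the sample data (including the placeholder date) are hypothetical.

```python
import datetime
import math


def find_non_json_values(value, path="root"):
    """List the places where already-parsed YAML data falls outside JSON."""
    problems = []
    if isinstance(value, dict):
        for key, item in value.items():
            if not isinstance(key, str):
                # JSON object keys must be strings; YAML keys need not be.
                problems.append(f"{path}: non-string key {key!r}")
            problems.extend(find_non_json_values(item, f"{path}.{key}"))
    elif isinstance(value, list):
        for index, item in enumerate(value):
            problems.extend(find_non_json_values(item, f"{path}[{index}]"))
    elif isinstance(value, float):
        # YAML has .nan and .inf; JSON numbers must be finite.
        if math.isnan(value) or math.isinf(value):
            problems.append(f"{path}: non-finite float")
    elif not isinstance(value, (str, int, bool, type(None))):
        problems.append(f"{path}: {type(value).__name__} has no JSON equivalent")
    return problems


# Hypothetical data shaped like what a YAML parser might return.
loaded = {
    "name": "STAR",
    2013: "integer used as a mapping key",
    "released": datetime.date(2000, 1, 1),  # placeholder date, not a real release date
}

problems = find_non_json_values(loaded)
for problem in problems:
    print(problem)
```

A YAML-LD specification would have to say whether such documents are rejected or coerced; this sketch only detects them.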

Note that the JSON-LD specs generally call for parsing into an intermediate representation. I don't think we contemplated replacing @ with $ in keywords, and I'm not sure this would achieve consensus, but it could reasonably be contained to a YAML-LD parsing layer.
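A parsing layer of that kind could look roughly like the Python sketch below. It is only an assumption about how such a layer might work, not an agreed convention: it takes data already loaded from YAML and rewrites any key or string value that is "$" followed by a JSON-LD 1.1 keyword name into the "@" form. The motivation is that "@" is a reserved indicator in YAML, so "@context" has to be quoted while "$context" does not. The names JSONLD_KEYWORDS and dollar_to_at are hypothetical.

```python
# JSON-LD 1.1 keywords, without the leading "@".
JSONLD_KEYWORDS = frozenset({
    "base", "container", "context", "direction", "graph", "id", "import",
    "included", "index", "json", "language", "list", "nest", "none",
    "prefix", "propagate", "protected", "reverse", "set", "type", "value",
    "version", "vocab",
})


def dollar_to_at(value):
    """Recursively rewrite "$keyword" spellings to "@keyword".

    Known limitation of this sketch: a literal string that happens to be
    exactly "$id", "$type", etc. is rewritten too.
    """
    if isinstance(value, dict):
        return {dollar_to_at(key): dollar_to_at(item) for key, item in value.items()}
    if isinstance(value, list):
        return [dollar_to_at(item) for item in value]
    if isinstance(value, str) and value.startswith("$") and value[1:] in JSONLD_KEYWORDS:
        return "@" + value[1:]
    return value


# Hypothetical document, as it might come out of a YAML parser.
document = {
    "$context": {
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
        "←": {"$id": "rdfs:domain", "$type": "$id"},
    },
    "price": "$5",  # not a keyword, so it is left alone
}

converted = dollar_to_at(document)
print(converted)
```

The result is ordinary JSON-LD that can be handed to an unmodified JSON-LD processor, which keeps the "$" spelling out of the core specs.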

If you like, I can move this issue to json-ld.org, and we can poll to see if there's enough interest to work on this in the CG. If so, I can facilitate creating the repository, and that repo would be a good place for discussions, along with public-linked-json@w3.org.

Edit: replaced the erroneous https://github.com/json-ld/csv-ld with https://github.com/json-ld/yaml-ld.

anatoly-scherbakov commented 2 years ago

I would propose the following grounds for replacing @ with $: while JSON is machine readable and writable, it is not very human readable and, especially, not very human writable. YAML is much friendlier in that regard because of its much lower syntactic noise. Replacing these characters reduces that noise further and makes the data files faster to type.

In general, I'd name manually writable semantic data as the main purpose of YAML-LD.

I will be happy to participate in the standardization process if one is to be initiated, and to assist however I can.

VladimirAlexiev commented 2 years ago

@gkellogg thanks for your support! Yes, please move the issue and post a poll. I can help in the standardization process.

> need to be in their own repository (perhaps https://github.com/json-ld/csv-ld)

Is that a lapsus linguae and you meant yaml-ld?

More importantly, is there a csv-ld separate from the very excellent CSVW?

tetron commented 2 years ago

@VladimirAlexiev I made a couple of updates to the original page to fix syntax errors:

https://github.com/common-workflow-language/common-workflow-language/wiki/Related-ontologies

Here's some more information about Schema Salad:

https://www.commonwl.org/v1.2/SchemaSalad.html

JSON-LD is a W3C standard providing a way to describe how to interpret a JSON document as Linked Data by means of a "context". JSON-LD provides a powerful solution for representing object references and namespaces in JSON based on standard web URIs, but is not itself a schema language. Without a schema providing a well defined structure, it is difficult to process an arbitrary JSON-LD document as idiomatic JSON because there are many ways to express the same data that are logically equivalent but structurally distinct.

Several schema languages exist for describing and validating JSON data, such as the Apache Avro data serialization system, however none understand linked data. As a result, to fully take advantage of JSON-LD to build the next generation of linked data applications, one must maintain separate JSON schema, JSON-LD context, RDF schema, and human documentation, despite significant overlap of content and obvious need for these documents to stay synchronized.

Schema Salad is designed to address this gap. It provides a schema language and processing rules for describing structured JSON content permitting URI resolution and strict document validation. The schema language supports linked data through annotations that describe the linked data interpretation of the content, enables generation of JSON-LD context and RDF schema, and production of RDF triples by applying the JSON-LD context. The schema language also provides for robust support of inline documentation.

Schema Salad is more complex than a 1:1 mapping of JSON-LD elements to YAML, but might be useful to people.

gkellogg commented 2 years ago

> @gkellogg thanks for your support! Yes, please move the issue and post a poll. I can help in the standardization process.
>
> > need to be in their own repository (perhaps https://github.com/json-ld/csv-ld)
>
> Is that a lapsus linguae and you meant yaml-ld?
>
> More importantly, is there a csv-ld separate from the very excellent CSVW?

Yes, indeed. CSV-LD was an early proposal for what became CSV on the Web. We wrote up something about it here a long time ago. I do think that CSVW needs to be revisited, and there is a nascent CG for it, but it hasn't reached critical mass.

gkellogg commented 2 years ago

I wasn't able to transfer this issue, but created https://github.com/json-ld/yaml-ld/issues/3 for further discussion. Please redirect future discussion there, and provide support for creating new CG work.

gkellogg commented 2 years ago

A repository has been created to push forward work in the CG: https://github.com/json-ld/yaml-ld.