Closed gkellogg closed 2 years ago
With support, I'll set up a new repo with a template ReSpec document and pr-preview.
I support starting such an initiative.
Quoting @anatoly-scherbakov from https://github.com/w3c/json-ld-syntax/issues/389#issuecomment-1127070677
I would propose the following grounds for the @ → $ replacement: while JSON is machine readable and writable, it is not very human readable and, especially, not very human writable. YAML is much friendlier in that regard, due to much lower syntactic noise. Replacing these characters further reduces that noise, making the data files faster to type. In general, I'd name manually writable semantic data as the main purpose of YAML-LD.
I will be happy to participate in the standardization process if one is to be initiated, and to assist however I can.
I see your point about easing the manual editing of YAML-LD. It is a valid point.
But on the other hand, the principle of least surprise is important. Having YAML-LD behave differently from YAML2JSON + JSON-LD sounds like a very bad idea.
What I guess I could live with is for "native YAML-LD" processors to accept $-keywords in addition to @-keywords, with a flag (named e.g. strict-keywords) to disable this behaviour (in case someone wants to use $-prefixed names for another purpose).
What I guess I could live with is for "native YAML-LD" processors to accept $-keywords in addition to @-keywords, with a flag (named e.g. strict-keywords) to disable this behaviour (in case someone wants to use $-prefixed names for another purpose).
Yes, and I think there should be a similar flag for using $ instead of @ when serializing. Trying to preserve the original form using $ or @ is probably too complicated.
Note that using the $ "namespace" will overlap with other existing uses, at least from JSON. For example, JSON Schema has $schema, $vocabulary and other keywords that would not otherwise overlap with the JSON-LD keyword namespace.
A formal YAML-LD variant that is a minimum-surprise syntax variant of JSON-LD is a good idea, and I don't want to get in the way of that discussion.
However, I just wanted to link my comment from the other issue:
https://github.com/w3c/json-ld-syntax/issues/389#issuecomment-1127716287
JSON-LD has some limitations when working with certain idiomatic JSON structures. Schema Salad is a schema language for YAML and JSON documents which describes validation and transformation from idiomatic YAML structures to JSON-LD structures and then on to RDF. We have used it extensively for our own use case (describing the Common Workflow Language), but I wanted to mention it here to see if there's interest in generalizing it, separately from this YAML-LD discussion.
@pchampin @gkellogg I agree with using @-keywords as a default, unless a flag is supplied to the processor to enable $-keywords. The YAML-LD preprocessor which converts $ to @ might also take care to convert only the keywords which are reserved by JSON-LD. This will also resolve the possible conflict with JSON Schema (I guess the two standards have different sets of keywords).
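A minimal sketch of such a preprocessor, assuming the YAML document has already been parsed into plain Python dicts and lists. The function name `dollar_to_at` and the `strict_keywords` parameter (modelled on the strict-keywords flag floated above) are hypothetical; the keyword set is the one listed in the JSON-LD 1.1 syntax specification. Only keys are rewritten here; whether keyword values such as "$id" should also be converted is left open.

```python
# JSON-LD 1.1 keywords, written without their leading "@".
JSONLD_KEYWORDS = frozenset({
    "base", "container", "context", "direction", "graph", "id", "import",
    "included", "index", "json", "language", "list", "nest", "none",
    "prefix", "propagate", "protected", "reverse", "set", "type", "value",
    "version", "vocab",
})


def dollar_to_at(node, strict_keywords=False):
    """Recursively rename $-prefixed keys to the matching @-keywords.

    Only keys whose name is a JSON-LD keyword are renamed, so a JSON Schema
    key such as "$schema" is left untouched. With strict_keywords=True
    (hypothetical flag) nothing is renamed at all.
    """
    if strict_keywords:
        return node
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            if (isinstance(key, str) and key.startswith("$")
                    and key[1:] in JSONLD_KEYWORDS):
                key = "@" + key[1:]
            result[key] = dollar_to_at(value)
        return result
    if isinstance(node, list):
        return [dollar_to_at(item) for item in node]
    return node  # scalars are returned unchanged


doc = {"$context": {"$vocab": "http://example.com/ns/"},
       "$id": "#test", "$schema": "kept as-is"}
converted = dollar_to_at(doc)
```

In this sketch `converted` has the keys `@context` and `@id`, while `$schema` survives unchanged because it is not a JSON-LD keyword.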
@tetron Thank you for the information, will have a look to compare Schema Salad with plain YAML-LD that I am currently using.
@tetron There is no dispute that JSON-LD is no schema language and is often complemented with JSON Schema.
JSON-LD has some limitations when working with certain idiomatic JSON structures.
But can you elaborate on this point using examples? I think that JSON-LD 1.1 is very flexible, e.g. local term definitions, adding/removing auxiliary keys that have no reflection in RDF, etc.
@OR13 @nicholascar please vote for YAML-LD CG workgroup above
I'm not sure I have the cycles to help much with YAML-LD, but I think it's a good idea. We often use OAS / YAML with JSON-LD and JSON Schema.
In particular, I like the idea of controlling both semantics and data shape at the same time, using only 1 file.
But can you elaborate on this point using examples? I think that JSON-LD 1.1 is very flexible, e.g. local term definitions, adding/removing auxiliary keys that have no reflection in RDF, etc.
So, schema salad was created in response to JSON-LD 1.0 (and roughly 5 years before JSON-LD 1.1), the main motivations were:
Examples of the last two:
steps:
  # this is an identifier map, the key is the id
  step1:
    in:
      # another identifier map,
      # the identifier is syntactically scoped so it gets appended to the
      # enclosing id
      # the value is assigned to a default predicate of "source" since it's a scalar
      input_parameter: source_parameter_uri
    out: [output_parameter]
This is translated to json-ld 1.0 that looks something like this:
{
  "steps": [{
    "@id": "step1",
    "in": {
      "@id": "step1/input_parameter",
      "source": "source_parameter_uri"
    },
    "out": [{
      "@id": "step1/output_parameter"
    }]
  }]
}
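To make the two ideas in the example concrete (identifier maps with scoped ids, and a default predicate for scalar values), here is a rough Python sketch. It is not Schema Salad's implementation; `expand_id_map` and `expand_steps` are hypothetical helpers written only for this example, and they operate on already-parsed data. Unlike the JSON above, this sketch always emits "in" as a list of nodes.

```python
def expand_id_map(id_map, parent_id=None, default_predicate=None):
    """Turn {key: body} into a list of nodes with a scoped "@id".

    A scalar body is shorthand for {default_predicate: body}.
    """
    nodes = []
    for key, body in id_map.items():
        # Scope the identifier under the enclosing node's id, if any.
        scoped_id = f"{parent_id}/{key}" if parent_id else key
        if not isinstance(body, dict):
            if default_predicate is None:
                raise ValueError(f"scalar value for {key!r} needs a default predicate")
            body = {default_predicate: body}
        nodes.append({"@id": scoped_id, **body})
    return nodes


def expand_steps(doc):
    """Expand the identifier maps of the workflow example into node lists."""
    steps = []
    for step in expand_id_map(doc["steps"]):
        step_id = step["@id"]
        step["in"] = expand_id_map(step["in"], parent_id=step_id,
                                   default_predicate="source")
        step["out"] = [{"@id": f"{step_id}/{name}"} for name in step["out"]]
        steps.append(step)
    return {"steps": steps}


workflow = {"steps": {"step1": {
    "in": {"input_parameter": "source_parameter_uri"},
    "out": ["output_parameter"],
}}}
expanded = expand_steps(workflow)
```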
We've also implemented code generators for Python and Java. It would probably also be straightforward to write a translator to express the schema as SHACL.
I'll start a "YAML-LD UCRs" issue and include "polyglot modeling"
Having YAML-LD behave differently from YAML2JSON + JSON-LD sounds like a very bad idea.
I fully agree. To avoid confusion I'd vote to
- either limit the specification of YAML-LD to half a page, essentially stating that YAML-LD is JSON-LD expressed in YAML, so any standard conversion YAML <-> JSON (without any exceptions such as $ / @ keys) can do
- or to make clear that YAML-LD is not "JSON-LD in YAML" but an independent data format, loosely based on JSON-LD. Even if it's 99% like JSON-LD, you cannot use standard tools but need a specific YAML-LD aware software library to avoid running into edge cases.
Option 1 can be done as part of JSON-LD, option 2 is an independent project with unknown outcome and adoption.
In my opinion, writeability is the primary reason to even have YAML-LD, and it is also my feeling that the @ → $ replacement has greatly improved my own writing experience when authoring YAML-LD data files and contexts.
I believe the -LD suffix can be generally used to describe different data formats somehow augmented with Linked Data. For instance,
This is not an argument but I am also using Markdown with YAML-LD front matter meta data, and call it Markdown-LD :)
Thus, if I had to choose from the options @nichtich has proposed, I would vote for (2). I still believe the specification can be half a page describing the exact logic of the conversion from YAML-LD to JSON-LD and vice versa, but I believe that the potential space of -LD data formats is vast. Each of them should be equipped with its own tools, even though the meaning of those formats can probably, in most cases, be derived from the JSON-LD 1.1 specification, and we could use JSON-LD contexts to interpret the data files.
I have written up a paper about YAML-LD recently and I would be happy to get feedback from the community, but the conference I submitted the paper to requires double blind review so I am uncertain whether I am at liberty to share the draft. Guess I would have to wait till the organizers' decision about whether they're going to publish it or not.
@nichtich I believe that your option 2 above would be far too confusing for many people. I would rather make YAML-LD a superset of JSON-LD in YAML, i.e. adding some specific idioms / patterns that "native" YAML-LD processors would understand (e.g. the $-keywords replacement). But basically those specific idioms would be pre-processed to produce valid JSON-LD, which could then be fully compliant.
And I would make it easy to recognize/require YAML-LD documents that are strictly JSON-LD in YAML (i.e. do not require the pre-processing), like a media-type parameter (e.g. text/ld+yaml;profile=strict).
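As a rough illustration of how a consumer might act on such a parameter, the sketch below parses a Content-Type value with Python's standard library and decides whether the $-keyword pre-processing step is needed. The media type text/ld+yaml and the profile=strict parameter are only the example floated in this comment, not a registered type, and `needs_preprocessing` is a hypothetical helper.

```python
from email.message import Message


def needs_preprocessing(content_type):
    """Return False when the document declares itself strict JSON-LD in YAML."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "text/ld+yaml":
        raise ValueError(f"not a YAML-LD media type: {content_type!r}")
    # get_param returns None when the parameter is absent.
    return msg.get_param("profile") != "strict"
```

With this sketch, `needs_preprocessing("text/ld+yaml;profile=strict")` is false, while a bare `text/ld+yaml` is treated as possibly containing YAML-LD-specific idioms.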
Folks, please contribute requirements in https://github.com/json-ld/yaml-ld/issues/2
@pchampin @nichtich Sorry, I find your "will be confusing" argument unconvincing. Let me play devil's advocate and apply your argument to other situations:
- s p o1, o2 and a: "these are too confusing, let's stick with ntriples"
- -> for log:infers and {graph1} -> {graph2} for inference: "Too confusing, let's stick with Turtle". I think @josd and TimBL may have an issue with that ;-)
As a data architect, I want to be able to write shortcut notation like this, and get proper RDF (see https://github.com/json-ld/yaml-ld/issues/2 Shortcuts), and even the other way around. The example below is shorthand Turtle, but I'd like a similar spirit in YAML.
:Person a rdf:Class;
:born a rdfs:Property; <- :Person; -> xsd:date.
Doc1234 :creator ~tobyink .
~tobyink a :Person; :born 1980-01-01 .
`Example-Distribution 0.001 cpan:TOBYINK` issued 2012-06-18 .
I left an example of our use of OAS (Open API Specification) and JSON-LD here: https://github.com/json-ld/yaml-ld/issues/2
TLDR; OAS supports JSON Schema represented in YAML, we tweaked the JSON Schema to support JSON-LD terms, so now we can present RDF types and JSON Schema types in a single YAML file.
Having YAML-LD behave differently from YAML2JSON + JSON-LD sounds like a very bad idea.
I fully agree. To avoid confusion I'd vote to
- either limit the specification of YAML-LD to half a page, essentially stating that YAML-LD is JSON-LD expressed in YAML, so any standard conversion YAML <-> JSON (without any exceptions such as $ / @ keys) can do
- or to make clear that YAML-LD is not "JSON-LD in YAML" but an independent data format, loosely based on JSON-LD. Even if it's 99% like JSON-LD, you cannot use standard tools but need a specific YAML-LD aware software library to avoid running into edge cases.
Option 1 can be done as part of JSON-LD, option 2 is an independent project with unknown outcome and adoption.
JSON-LD 1.1 explicitly chose to use an intermediate representation to allow other formats to map easily into that representation. While your option 1 can't realistically be done in half a page, IMO any spec should confine itself to this level of interoperability. (This is consistent with allowing @ to be optionally replaced with $, I believe.)
The challenges of getting JSON-LD standardized in the first place should be a cautionary tale for trying to do something similar but different, so I don't see option 2 as viable.
I was asked by @VladimirAlexiev to vote on this, so I did, but to be clear I am voting that there should be some official position on how to represent JSON-LD in YAML.
I am equally convinced by the two schools here:
I am personally biased towards 1 as I rarely author contexts directly, instead I autogenerate these from a yaml-native "polyglot" language (LinkML), but I am sympathetic to those who do author the contexts directly.
@VladimirAlexiev regarding your examples: Turtle (respectively N3) is a strict superset of N-Triples (respectively Turtle), so I don't see this as confusing. Shacl-C is a completely different syntax from Shacl in Turtle, so I don't see this as confusing either. I agree that the co-existence of JSON-LD and RDF/JSON might be confusing, and note that the RDF WG decided at the time to recommend only one of them (RDF/JSON is only a note).
@nichtich's proposition 2 is "an independent data format, loosely based on JSON-LD", where "loosely" could mean up to "99%". The closer this YAML-LD is to JSON-LD, the more surprising it will be for users when they hit a difference. Furthermore, "you cannot use standard tools but need a specific YAML-LD aware software" sounds a lot like reinventing the wheel.
I guess what I really don't want is to define something from scratch. If YAML-LD is to be based (even loosely) on JSON-LD, then I would strongly advise that YAML-LD processors are based on a JSON-LD processor under the hood. Experience shows that developing a bug-free JSON-LD processor is tricky. Requiring a similar, but different, effort for YAML-LD, seems like a bad idea.
If YAML-LD is going to be more than a simple application of YAML syntax to express JSON-LD documents (proposition 1), it may help to reduce references to JSON-LD to avoid mixing levels of description (YAML, JSON-LD, RDF...). The specification could be limited to rules on how to transform a YAML-LD document into a JSON document (e.g. replace defined uses of $ with @, expand YAML tags...) without detailed knowledge of JSON-LD and RDF.
For instance the specification could explain how to transform these YAML documents
$id: "http://example.org"
---
$id: 1
into JSON documents {"@id": "http://example.org"} and {"@id": 1} respectively. The latter is not valid JSON-LD, but this would be irrelevant to the transformation rules from YAML-LD to JSON.
The final sentence of the specification would be a requirement that JSON encoded by the YAML-LD transformation rules must be valid JSON-LD. This can (and should) be checked with existing JSON-LD parsers.
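The separation described here, a purely syntactic transformation followed by a separate validity check, could be sketched as follows. This assumes the YAML stream has already been parsed into Python dicts; `RENAMES`, `to_json` and `toy_validate` are hypothetical names, and `toy_validate` is only a stand-in for a real JSON-LD processor (it checks nothing beyond the type of "@id").

```python
import json

# Stage 1 knows only a fixed table of key renames, nothing about JSON-LD or RDF.
RENAMES = {"$id": "@id", "$type": "@type", "$context": "@context"}


def to_json(document):
    """Apply the rename table to top-level keys and serialize to JSON."""
    renamed = {RENAMES.get(key, key): value for key, value in document.items()}
    return json.dumps(renamed)


def toy_validate(json_text):
    """Stage 2 stand-in: a real JSON-LD parser would go here."""
    data = json.loads(json_text)
    if "@id" in data and not isinstance(data["@id"], str):
        raise ValueError("@id must be a string")


documents = [{"$id": "http://example.org"}, {"$id": 1}]
encoded = [to_json(doc) for doc in documents]
```

Both documents pass the first stage; only the check in the second stage rejects the second one, which mirrors the point that the transformation rules themselves do not need to know what valid JSON-LD is.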
@nichtich I'd agree. I tried to formalize this as follows.
A document D is designated as a valid YAML-LD document if, and only if:
$ character, replace $ with @ if and only if the resulting string will be a valid JSON-LD reserved keyword as per its specification.
Idea: instead of $-keywords, why not define a YAML tag for each JSON-LD keyword (i.e. !context, !type, ...)? These tags would only expect an empty string, so they should be used "on their own", e.g.
!context :
  !vocab : http://example.com/ns/
!id : "#test"
!type : Foo
bar: baz
PROS: while $context and other $-keywords could (in theory) be intended as regular JSON keys, tags have no direct JSON interpretation, so tag-keywords are unambiguous.
CONS: from a short experiment, it seems you need a space between the tag and the colon (:), while you don't need it with "regular" keys, which is error-prone...
A couple of thoughts on how to specify YAML-LD as an extension of JSON-LD API:
There is certainly enough interest to start an activity for YAML-LD. I'll create and set up a repo for this purpose. I think both the spec and the UCR documents can be in the same repo, but PR Preview will only use one of them for nicely formatted PRs.
I'll put out a proposal for a CG call (maybe @pchampin can help with a Zoom setup), which would be useful for more than just YAML-LD.
The repo has been set up at https://github.com/json-ld/yaml-ld. If you would like to contribute and are a member of the JSON-LD Community Group, I can add you to the contributors team. Please create an issue (or respond to an already existing issue) to be added to the team.
Moving this issue to the yaml-ld repo.
@pchampin this is an interesting idea but I'd say that the required space character is a great irregularity introduced to the syntax, and is a potential source of mistakes.
It might be interesting to use YAML tags for something in YAML-LD context, but I have no idea how at present. $type seems to work fine to assign RDF types to nodes. I do not have any other ideas.
@anatoly-scherbakov
@pchampin this is an interesting idea but I'd say that the required space character is a great irregularity introduced to the syntax, and is a potential source of mistakes.
I agree, unfortunately.
It might be interesting to use YAML tags for something in YAML-LD context, but I have no idea how at present.
I created a dedicated issue for this: #6.
Closing this. If you've made important remarks above, please post them as separate issues. In particular @gkellogg https://github.com/json-ld/yaml-ld/issues/3#issuecomment-1137630441, and maybe @pchampin ?
On w3c/json-ld-syntax, w3c/json-ld-syntax#389 proposes advancing work on YAML-LD. I was not able to transfer the issue to this repository, but further discussion and votes in support of starting the initiative in the JSON-LD CG should be voiced here. Given sufficient support, we'll create a yaml-ld repository for the work to proceed.