Closed anatoly-scherbakov closed 2 years ago
I think we need to analyze the potential for a YAML-LD file including some well-known context which defines $ equivalents for @ keywords. This would work with the existing algorithms and allows all keywords other than @context to use a $ form. It also round-trips through compaction.
We could create a standard context at, say, https://www.w3.org/ns/json-ld/yaml-ld.jsonld containing these keyword definitions, which a YAML-LD source file could reference. This doesn't work for the expanded form, but I don't see that as an issue.
Note that many contexts already provide bare-word equivalents for keywords, such as id => @id, type => @type, so including such a context might not be necessary.
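As a rough illustration of the idea above, here is a minimal Python sketch of what such a convenience context might contain and how key aliases map to keywords and back. The particular set of `$` terms, the variable `CONVENIENCE_CONTEXT`, and the helper functions `resolve_aliases` and `reapply_aliases` are all hypothetical; the helpers are a toy stand-in that only renames keys and is not the JSON-LD expansion or compaction algorithm.

```python
import json

# Hypothetical convenience context: each "$" term aliases a JSON-LD keyword.
# "@context" cannot be aliased, so it is deliberately absent.
CONVENIENCE_CONTEXT = {
    "$id": "@id",
    "$type": "@type",
    "$language": "@language",
    "$value": "@value",
    "$graph": "@graph",
}


def resolve_aliases(node, mapping):
    """Toy key renaming: replace every key found in `mapping`, recursively."""
    if isinstance(node, dict):
        return {mapping.get(k, k): resolve_aliases(v, mapping) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve_aliases(item, mapping) for item in node]
    return node


def reapply_aliases(node, context):
    """Inverse direction: turn keywords back into their aliases."""
    inverse = {keyword: alias for alias, keyword in context.items()}
    return resolve_aliases(node, inverse)


# A document as a YAML-LD author might write it (shown here as a Python dict).
doc = {
    "@context": "https://www.w3.org/ns/json-ld/yaml-ld.jsonld",
    "$id": "http://example.org/foo",
    "$type": "Person",
    "name": "Foo",
}

with_keywords = resolve_aliases(doc, CONVENIENCE_CONTEXT)
print(json.dumps(with_keywords, indent=2))
```

In this sketch only `@context` still needs quoting on the YAML side, and applying `reapply_aliases` to the result gives back the original document, which is the round-trip property mentioned above.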
We could create a standard context at, say https://www.w3.org/ns/json-ld/yaml-ld.jsonld containing these keyword definitions which a YAML-LD source file could reference. This doesn't work for the expanded form, but I don't see that as an issue.
Yes, I thought of that. It would work... except for @context, which cannot be aliased (https://www.w3.org/TR/json-ld11/#aliasing-keywords).
Yes indeed, this is quite an interesting idea.
This is a lot of language change just to avoid some quotation marks. It's come up before in the JavaScript community (and others), whose members frequently get annoyed by having to write doc['@context'] in JS vs. the "ideal" doc.@context (which isn't valid, of course).
I'd think that @gkellogg's approach of providing an alias mapping would provide a good workaround for 99% of the keywords, and @context's uniqueness would continue to (as it has now for years) provide an in-document "signal" that one is dealing with JSON-LD (just as $schema does for JSON Schema).
That seems the simplest approach with the fewest possible confusions and/or conflicts with other communities who may use $ as a prefix.
I think this "convenience context" as a best practice is a great idea! I have experienced the pain of writing quotations in YAML (and JavaScript) myself and this still enables staying compatible enough.
I would like to raise a question though: Is the $ character the best choice? It may very well be, but I think it would make sense to look at possible options for a while, as it might potentially save hassle in the future. E.g. it might make sense to be sure to support as many language quirks as possible, similar to avoiding the doc['@context'] vs. doc.@context issue in JavaScript mentioned by @BigBlueHat.
Do you think alternative characters are a topic worth discussing, or is $ a clear choice?
Here is a list of special ASCII characters, some of which are already reserved for special purposes: ! " # $ % & ' * + , - . / : ; < = > ? @ \ ^ _ | ~
My experience in interoperability suggests being very cautious with clever solutions that can be easily replaced by IDE features.
Anyway, if you are interested in following that path, @juusoautiosalo's question is correct: pick another character instead of "$" to avoid overlaps with JSON Schema.
My current opinion is this:
@context keyword cannot be overridden this way, we shall have to just accept that);
@id is mapped to id, or they can invent their own. In fact they should invent their own mappings suitable for their domain.
This will make the JSON-LD → YAML-LD → JSON-LD round-trip workable and reduce the complexity of the whole system because we delegate it to the existing JSON-LD mechanism.
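- ok that using a specific context is the way to go if an author wants to do it
- proponents should identify a different char for that: € and £ are valid alternatives to $ to me :P
- I won't recommend this practice since it paves the way to security issues, but environments are different and I can just talk for mine. Surely $ is to be avoided for clashes with other specs.
My 2¢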
- proponents should identify a different char for that: € and £ are valid alternative to $ to me :P
$ is a part of the most standard ASCII layout. This is not a very good argument, since I had mentioned myself that the user might use any mapping they want; perhaps they want to map @id to 🧸 in their context, why not? Looks cute.
But still, perhaps it would be a good thing for the standard Convenience Context to stick to ASCII.
- I won't recommend this practice since it paves the way to security issues, but environments are different and I can just talk for mine. Surely $ is to be avoided for clashes with other specs.
Could you illustrate with an example use case where $ can be harmful? This could greatly contribute to the discussion.
I provided some reasoning behind character choice at https://github.com/json-ld/yaml-ld/issues/55
@anatoly-scherbakov
The most interoperable way to avoid quoting "@" is to engage with the YAML community until "@" is un-reserved.
Another solution is to implement a feature in IDE YAML plugins that automatically adds quotes when writing "@".
JSON-LD can be processed after the conversion to RDF: this means that the context is capable of modifying how a message is processed https://github.com/web-payments/web-payments.org/issues/21
For example, a malicious agent could exploit some leaky checks and be able to replace @id, thus creating confusion in an entry. IMHO this approach is then not advised when you want to enforce syntactic validation (e.g. via WAF or a similar tool). In a closed ecosystem, people who do not have security constraints and do not need to interoperate with external entities may still like it.
{
  "@context": {
    "£id": "@id"
  },
  "@id": "http://example.org/foo",
  "£id": "http://example.org/foo",  <- a legitimate JSON that breaks when adding the mapping context
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object"
}
One great benefit of YAML is that it does not allow duplicate keys (which is a long standing (security) issue for JSON). This approach re-introduces duplicate keys under a different form: an organization should probably make a security assessment before making such a decision.
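To make the concern concrete, here is a small Python sketch that checks whether two distinct keys of one node end up on the same keyword once an alias mapping is applied. The helper `find_colliding_keys` is hypothetical and only inspects a single node; it is not how a JSON-LD processor is implemented, although, as far as I know, JSON-LD 1.1 processors do report this situation as a "colliding keywords" error during expansion.

```python
def find_colliding_keys(node, aliases):
    """Return {keyword: [keys...]} for keywords reached by more than one key."""
    seen = {}
    for key in node:
        if key == "@context":
            continue  # the context is not a property of the node itself
        target = aliases.get(key, key)
        seen.setdefault(target, []).append(key)
    return {target: keys for target, keys in seen.items() if len(keys) > 1}


# Same shape as the JSON example above, minus the inline context.
doc = {
    "@id": "http://example.org/foo",
    "£id": "http://example.org/foo",
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
}

# Without the mapping, every key is distinct.
print(find_colliding_keys(doc, {}))
# With the mapping, "@id" and "£id" both resolve to "@id".
print(find_colliding_keys(doc, {"£id": "@id"}))
```

The point of the sketch is that the document is unambiguous as plain JSON or YAML and only becomes ambiguous once the mapping context is introduced.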
Let's call $-LD-files the ones using $-keywords. When you need to interoperate in the web you might have issues with:
$ to avoid conflicts with JSON LD @keywords) - this can be mitigated using £;
Those actors' files will not be JSON-interoperable with $LD files and will need to normalize them before using.
The $-context (or £-context) is defined to be specific to YAML-LD, but will spill-over to JSON-LD because of all implementations that use "dumb" JSON/YAML libraries to process these files.
@ioggstream -- Please edit your https://github.com/json-ld/yaml-ld/issues/11#issuecomment-1206308330 and put a code fence around @id (whether or not you then retain the double-quote wrapper), so that GitHub user doesn't continue to get pings on every update to this issue, in which they are probably not interested.
- we can provide a default context that maps @ to $ or some other character;
Why not just map to plain words (i.e., @id => id)? That is the practice followed by many other contexts. Maybe there should be a couple of suggested contexts for authors to use.
The most interoperable way to avoid quoting "@" is to engage with YAML community until "@" is un-reserved. Another solution is to implement a feature in IDE YAML plugins that automatically adds quotes when writing "@".
Although I think it would be great for the YAML community to consider this, it will be a while before it is widely deployed, so I don't think we can rely on any change to YAML to deal with our @ keywords.
JSON-LD can be processed after the conversion to RDF: this means that the context is capable of modifying how a message is processed web-payments/web-payments.org#21
JSON-LD has the concept of Protected Term Definitions and restrictions on how scoped contexts can be applied specifically to deal with this issue. Alongside Proofs/Signatures, these help maintain the integrity of JSON-LD documents. YAML-LD benefits from this practice.
- ok that using a specific context is the way to go if an author wants to do it
- proponents should identify a different char for that: € and £ are valid alternative to $ to me :P
- I won't recommend this practice since it paves the way to security issues, but environments are different and I can just talk for mine. Surely $ is to be avoided for clashes with other specs.
I'm generally opposed to introducing $ keyword alternatives, other than by defining them within a context document. I think it's getting out too far in front of a problem that may not really exist, and it introduces general complication to the processing model (and interoperability) that is not worth the change.
@ioggstream -- Please edit your #11 (comment) and put a code fence around @id (whether or not you then retain the double-quote wrapper), so that GitHub user doesn't continue to get pings on every update to this issue, in which they are probably not interested.
I took care of it again. Everyone should try to remember this when adding comments. (Note: I found that pressing COMMAND-e with some text highlighted does this easily on the Mac.)
$ is an ASCII character; the British pound and euro characters are not.
If we do not restrict ourselves to ASCII, then why is 🔸id worse than €id and £id? I do not believe it is.
I do not insist on including the convenience context into the specification: I will be able to use it anyway :)
@anatoly-scherbakov --
I know how to type $ (shift-4 on any US or ASCII keyboard).
I can quickly locate £ (opt-3), € (shift-opt-2), ¢ (shift-opt-4), and various others on my Mac, and those key-chords would become rote quickly enough.
Your suggested "small orange diamond" 🔸 does not appear to be available through a simple key-chord, and while GitHub finds it via :small_orange_diamond: it requires a minimum of :small_o to differentiate it from all other available Unicode/emoji characters. I doubt it's always even as easily accessed as that.
To my mind, that makes 🔸 substantially worse (read: less convenient) than € or £.
Why not just map to plain words (i.e., @id => id)? That is the practice followed by many other contexts. Maybe there should be a couple of suggested contexts for authors to use.
Having a context specific to YAML files impacts interoperability.
The fact that schema.org already maps id -> @id means that the $-context will somewhat collide with the schema.org context...
introducing $ keyword alternatives, ..introduces general complication to the processing model (and interoperability) that is not worth the change
:+1:
JSON-LD has the concept of Protected Term Definitions and restrictions on how scoped contexts can be applied specifically to deal with this issue
Agree, but I'd happily avoid testing buggy implementations :)
@ioggstream -- Please note that GitHub doesn't always preserve code fences in quoted lines, and you always need to review copied-and-pasted text ... so (i.e., @id => id) in the first line of the first quote in your latest comment, https://github.com/json-ld/yaml-ld/issues/11#issuecomment-1206863165, includes no code fence, and that user is getting pinged again.
@TallTed done, thanks. It seemed such a good idea to register the @id nick :P
Discussed in the TPAC F2F and resolved to close as won't fix.
As an author of YAML-LD files,
I want an ability to type keywords without quotes,
So that my authoring experience is better.
Motivation
I believe the primary purpose of having a Linked Data format based on YAML is to simplify manual authoring of the linked data documents. This means that, in an information system, we could ask domain experts to write YAML documents to describe their knowledge.
YAML is much easier to write manually than JSON because it does not require as much syntactic noise. Normally, keys can be written without quoting at all:
However, sooner or later the document author will have to define @type, @context, @language, or any other JSON-LD keyword; and then they have to remember that @ is a reserved character and that in such cases quoting is mandatory.
The potential author of YAML-LD documents is not necessarily a programmer; they might be a history student, an anthropologist, a biologist, a physicist.
Let's not make their life harder than it has to be.
Potential risks
- @ character might suddenly acquire a new meaning under YAML specification. This is not a risk because, under my proposal, we'll not have @ in our documents anyway, we'll have $.
- $ character might suddenly acquire a new meaning under YAML specification. I consider that unlikely: it has none now, and such a change would bear great risks for broken backwards compatibility, especially for those who combine YAML with JSON Schema.
- @schema or @ref, and JSON Schema does not have $context or $type. I am saying this based on brief web research; please do correct me if I am wrong.
Possible implementations
!yaml-ld tag proposed at #6: interesting, but for a non-programmer it will be just a nuisance, a piece of syntactic noise. I would propose to prioritize it to make YAML-LD as concise as possible without losing its writeability and readability.
Proposal
Let us replace $ → @ and vice versa only for the particular keywords. For instance,
will be converted into
because @context is a JSON-LD keyword and @schema is not.
Thus, we will minimize the possibilities for conflict while still getting rid of the nasty quotes.
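The proposed rule can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the names `JSONLD_KEYWORDS`, `dollar_to_at`, and `at_to_dollar` are hypothetical, the keyword set is a subset listed for the example, and the sketch rewrites mapping keys only, since the proposal does not say how keywords appearing as values should be treated.

```python
# Subset of JSON-LD 1.1 keywords (without the "@"), for illustration only.
JSONLD_KEYWORDS = {
    "base", "container", "context", "graph", "id", "included", "index",
    "language", "list", "nest", "reverse", "set", "type", "value", "vocab",
}


def _swap_prefix(node, old, new):
    """Recursively rename keys `old + keyword` to `new + keyword`."""
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            if key.startswith(old) and key[len(old):] in JSONLD_KEYWORDS:
                key = new + key[len(old):]
            result[key] = _swap_prefix(value, old, new)
        return result
    if isinstance(node, list):
        return [_swap_prefix(item, old, new) for item in node]
    return node


def dollar_to_at(node):
    """YAML-LD authoring form -> JSON-LD form (keys only)."""
    return _swap_prefix(node, "$", "@")


def at_to_dollar(node):
    """JSON-LD form -> YAML-LD authoring form (keys only)."""
    return _swap_prefix(node, "@", "$")


# Hypothetical parsed YAML-LD document mixing JSON-LD and JSON Schema keys.
source = {
    "$context": {"name": "http://schema.org/name"},
    "$type": "Person",
    "$schema": "http://json-schema.org/draft-07/schema#",
    "name": "Foo",
}

converted = dollar_to_at(source)
print(converted)
```

Here `$context` and `$type` become `@context` and `@type`, while `$schema` is left alone because `@schema` is not a JSON-LD keyword; `at_to_dollar(converted)` restores the original keys.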