w3c / json-ld-syntax

JSON-LD 1.1 Specification
https://w3c.github.io/json-ld-syntax/

Unexpected @type scope behavior (term definitions persist throughout JSON tree) #174

Closed dlongley closed 5 years ago

dlongley commented 5 years ago

I was trying to use type-scoped contexts to define a @context and was surprised to discover that any type-scoped terms defined in the active context continue to be defined beyond the object with the matching @type. I think this is very unexpected behavior from an OOP modeling perspective. It is also very problematic for @protected terms: it means that when terms are protected, you can't model objects of one type that contain objects of another type if the two types share a commonly used JSON key (which may or may not have the same term definition).

A playground example:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "ex:",
    "@protected": true,
    "Library": {
      "@context": {
        "book": "library:book",
        "name": "library:name"
      }
    },
    "Person": {
      "@context": {
        "name": "person:name"
      }
    }
  },
  "@id": "the:library",
  "@type": "Library",
  "book": {
    "@id": "the:book",
    "about": {
      "@id": "the:person",
      "@type": "Person",
      "name": "Oliver Twist",
      "book": "unexpectedly defined as library:book!"
    }
  }
}

Produces these quads:

<the:book> <ex:about> <the:person> .
<the:library> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <ex:Library> .
<the:library> <library:book> <the:book> .
<the:person> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <ex:Person> .
<the:person> <library:book> "unexpectedly defined as library:book!" .
<the:person> <person:name> "Oliver Twist" .

http://tinyurl.com/y2x4szzb

If you use @protected here, you get an error (which I also find unexpected):

{
  "@context": {
    "@version": 1.1,
    "@vocab": "ex:",
    "@protected": true,
    "Library": {
      "@context": {
        "@protected": true,
        "book": "library:book",
        "name": "library:name"
      }
    },
    "Person": {
      "@context": {
        "@protected": true,
        "name": "person:name"
      }
    }
  },
  "@id": "the:library",
  "@type": "Library",
  "book": {
    "@id": "the:book",
    "about": {
      "@id": "the:person",
      "@type": "Person",
      "name": "Oliver Twist",
      "book": "unexpectedly defined as library:book!"
    }
  }
}

That error happens even if name is defined the same way for both types.

I suspect that type-scoped terms behave this way because it was easy to implement, but I think it is very surprising behavior that may not have been exposed yet due to limited examples.

It's possible that there's an easy fix for this. I think we should change this behavior so that we track whether a term definition in the active context was defined via a type-scoped context, and whether it replaced a non-type-scoped term when it was. Then, whenever processing traverses into one of the typed object's properties, we revert all type-scoped terms to their previous definitions, which may mean setting them to null (clearing them) if they were previously undefined. Processing can then continue as normal.
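A minimal sketch of that tracking-and-revert idea (all names here, such as `typeScopedPrevious`, are illustrative; this is not spec text or jsonld.js code):

```javascript
// When a type-scoped context defines terms, remember each term's previous
// definition; when recursing into the typed object's properties, restore
// those previous definitions (clearing terms that were previously undefined).
function applyTypeScopedContext(activeCtx, scopedTerms) {
  const previous = {};
  for (const [term, def] of Object.entries(scopedTerms)) {
    // record what the term meant before (undefined if it was not defined)
    previous[term] = activeCtx.terms[term];
    activeCtx.terms[term] = def;
  }
  activeCtx.typeScopedPrevious = previous;
  return activeCtx;
}

function revertTypeScopedTerms(activeCtx) {
  // called when traversing into a property's value (another node object)
  const previous = activeCtx.typeScopedPrevious || {};
  const reverted = {...activeCtx, terms: {...activeCtx.terms}};
  for (const [term, prevDef] of Object.entries(previous)) {
    if (prevDef === undefined) {
      delete reverted.terms[term]; // term was not defined before: clear it
    } else {
      reverted.terms[term] = prevDef; // restore the pre-scoped definition
    }
  }
  delete reverted.typeScopedPrevious;
  return reverted;
}
```

With the issue's example, `name`/`book` from the `Library` type-scoped context would be reverted before processing the nested `Person` node.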

dlongley commented 5 years ago

Also, if this fix works, I think the JSON-LD syntax spec should clarify that the changes to the active context that bring in type-scoped terms only apply for terms used on the object with the matching type.

dlongley commented 5 years ago

I've implemented the fix in a PR to jsonld.js here: https://github.com/digitalbazaar/jsonld.js/pull/312 and fixed tests and added two more in PR https://github.com/w3c/json-ld-api/pull/89.

Note that if you define a @type scoped context that has property terms with their own scoped contexts, those will still be properly applied to deeply nested nodes within a type. This fix will only ensure that terms that are defined for objects with specific @type values won't leak to other nodes that don't have those types.
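For illustration, a hypothetical context along those lines (the `library:` and `ex:` IRIs are made up): the `Library` type-scoped context defines `book` with its own property-scoped context, so `title` would remain defined for nodes nested under `book`, while `book` itself would no longer be defined on nodes of other types.

```json
{
  "@context": {
    "@version": 1.1,
    "@vocab": "ex:",
    "Library": {
      "@context": {
        "book": {
          "@id": "library:book",
          "@context": {"title": "library:title"}
        }
      }
    }
  }
}
```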

gkellogg commented 5 years ago

Why is the expectation that type-scoped contexts are limited to the object containing the type different from the expectation that property-scoped contexts are limited to the object value of the property?

I haven't looked at your PR, but it would seem that the expansion algorithm needs to maintain two different contexts: the one it received (with possible updates from property scoping), and the one that comes from @type. When an embedded context is encountered, it needs to update both the type-scoped copy and the passed-in copy. This also needs to be reflected when handling nested properties.

It does dilute the message that property- and type-scoped contexts behave exactly as if they had appeared inline, as the type-scoped context would disappear when going deeper, while the property-scoped and any directly scoped contexts persist.

Also, what happens when a type-scoped context defines a term with a scoped context which is then used? As the algorithm is defined, the expansion algorithm won't see that scoped context, as it's not defined in the current context.

dlongley commented 5 years ago

@gkellogg,

Why is the expectation that type-scoped contexts are limited to the object containing the type different than for property-scoped contexts being limited to the object value of the property?

I expect the primary audience for @type-scoped properties to be people who are using OOP modeling. This means defining a type and the properties you expect to see on that type. The "scope" is the object with a matching @type. If you move beyond that scope (into another object of another @type), it's quite unusual for the terms to be defined. This becomes even more obvious as you move into some deeply nested structure that has a variety of other typed objects along the way.

The primary audience for property-scoped terms is one that is defining properties for different sections of their JSON tree. If you traverse into branch X of the document, then terms A, B, and C will be defined. This is also intuitive for that audience; I think having to redefine the terms when you're on the same branch (though you've gone deeper into it) would be quite unexpected. This is different from the @type situation because the @type scope changes when you move deeper into a JSON branch (because @type itself doesn't persist), whereas the branch does persist; you're just further along it.

I haven't looked at your PR, but it would seem that the expansion algorithm needs to maintain two different contexts, that it received (with possible update from property scoping), and those that come from @type. When an embedded context is encountered, it needs to update both the type-scoped copy and the passed in copy. This also needs to be reflected when handling nested properties.

You don't need to maintain two different contexts, you create a new active context (a clone that removes the @type-scoped terms) when you recurse into the typed object (when you follow its properties to other objects).

Also, what happens when a type-scoped contexts defines a term with an scoped context which is then used? As the algorithm is defined, the expansion algorithm won't see that scoped context, as it's not defined in the current context.

I have a test for this and it is seen. In that case, a term scoped context is created prior to recursing into the object (there is no change to the existing algorithm). Since it is a property-term-scoped context, it functions as expected (defining terms anywhere along the tree branch).

dlongley commented 5 years ago

With the above changes, I was able to update the VC context to use type-scoped contexts with @protected terms:

https://raw.githubusercontent.com/dlongley/vc-data-model/flatten-context/contexts/credentials/v1

gkellogg commented 5 years ago

Okay, that looks like a good approach. I'll work on my own implementation.

iherman commented 5 years ago

This issue was discussed in a meeting.

iherman commented 5 years ago

This issue was discussed in a meeting.

Descends commented 5 years ago

urmmmm.... im sorry what is this?

On Fri, 31 May 2019 at 18:14, Ivan Herman notifications@github.com wrote:

This issue was discussed in a meeting https://www.w3.org/2018/json-ld-wg/Meetings/Minutes/2019/2019-05-31-json-ld#section3-1 .

  • RESOLVED: Un-defer #108 with propagation as the use case

View the transcript:

3.1. Type scoped context continued; property wildcard

Rob Sanderson: link - #174 https://github.com/w3c/json-ld-syntax/issues/174
Rob Sanderson: what is the difference between type-scoped contexts and property-scoped contexts? Some thought of it as scoped to the properties of that class; others thought of it as a replacement for an inline context, which would then expand beyond that class.
… where we came to last week is that there are good use cases for both, but the only way to allow for both use cases is to have type-scoped contexts be class-only, and to have a way to expand beyond them by setting a default context within.
… is that sufficiently detailed to explain where we are right now?
Gregg Kellogg: I didn't quite understand until right now. I'm trying to think of the syntax
Dave Longley: my understanding is that what we're looking for is to take this other context and define it within this scoped context, and then use it for all properties within that scoped context
… We want to be able to reuse existing contexts within a type-scoped context, so we don't have to be verbose typing out all of those contexts again.
… syntactically, we can currently do this by re-writing all contexts within each of those properties, but that's verbose.
Rob Sanderson: Example use case: https://preview.iiif.io/api/image-prezi-rc2/api/presentation/3/context.json type scopes in http://www.w3.org/ns/anno.jsonld for Annotation and AnnotationPage
Ivan Herman: so, if I want to have all schema properties valid within that type-scoped property, and to inherit, and do it by including the schema context file, not each property inline.
Rob Sanderson: an example: we're using type scoping within annotations to pull in the annotation context, which is a 1.0 context, and since the decision is that the annotations referred to would no longer inherit, this would need to be modified with a new keyword to maintain this behavior instead of retyping each property for each context
Ivan Herman: so we want hasBody to remain an annotation?
Rob Sanderson: we want the resource that is pointed to by that property to be an annotation, even though that annotation context is only valid on that class
Gregg Kellogg: I understood that this could be for specific properties, but I thought of wildcard as applying to all properties
… for instance, if you're traversing to FOAF, you might not want to continue to use schema.org properties
… syntax and wildcard: we could use full wildcarding or something like a URI prefix
… but then what happens when they have contexts defined? I presume they're honored as well
… how deeply have we thought about the various cases
… and would it be a property of the property term definition, or a property of the class term definition that then defines those terms?
Rob Sanderson: we had not talked about globbing or real wildcarding: we'd talked about a shorthand for not retyping all properties within that context.
… you would then need to define all schema.org contexts for every class below that needs them to apply
… the question is at what level does the wildcard apply? Is it at the ontology level, or is it at the context level?
… we'd talked about it at the context level, which is consistent with how other things work
Gregg Kellogg: expanding treats properties as terms, not expanded URIs, and compacting we select terms by matching, not via URI. Enumerating properties by terms, not URIs, is more consistent with how we do things currently
Rob Sanderson: some solution that says, for all the terms within this context, treat them as property-scoped within this class
… like what dlongley put in the chat: for all properties, treat them as property-scoped contexts.
… which then wouldn't need actual wildcarding, just matching
… which seems easier
David Newbury: I'm wondering if this doesn't suggest that @type scoping itself could be clearer and provide the approach to inheritance that people are expecting here
Rob Sanderson: could we just have two keywords, one for each behavior?
Dave Longley: I don't know if it's exactly the same, because comparability differs here.
… when we pull them in, we treat them all as if they're property-scoped terms, which is different than the behavior before.
Dave Longley: +1 to something along the lines of what gregg is saying
Gregg Kellogg: I think that if we have a property that can appear in a type-scoped context that says that all terms within that context inherit that context, or perhaps enumerated terms inherit, and in the absence, no terms inherit, and then it could appear only on type-scoped contexts
Dave Longley: I think that we're thinking that each one of these contexts would then consider the type scoping as if it were defined on all descending properties
Gregg Kellogg: and it would be recursive: this would then travel down the property chain
Dave Longley: yes
Gregg Kellogg: unless that property redefines its own scope
… that seems reasonable
Rob Sanderson: can we see a straw person example?
Gregg Kellogg: @inheritPropertyScopes: true
Gregg Kellogg: @inheritTypeScopes: ['a', 'b']
Gregg Kellogg: do those terms need to be defined within that scope, or do they just need to have been in scope at the time it's interpreted?
Rob Sanderson: that would not work for our use case, since the properties of the annotation are not known higher up the chain
Dave Longley: processing: do you see if it appears up higher to see… (lost the chain here)
Gregg Kellogg: I think your use case would be solved by using true
Rob Sanderson: correct.
Dave Longley: when defining a term within a type-scoped context, look for @inheritPropertyScopes
Dave Longley: and if that appears, add a property-scoped context to the term definition
Dave Longley: (unless one already appears, as that one would take precedence)
Gregg Kellogg: we should come up with a better name
Rob Sanderson: in our case, at the high level, our use case is…
Rob Sanderson: { "Annotation": {"@id": "oa:Annotation", "@inheritPropertyScopes": true, "@context": "http:...anno.jsonld"} }
Pierre-Antoine Champin: @propagates ?
David Newbury: @descends ?
Rob Sanderson: we can then just update the 1.1 context
Benjamin Young: This is pretty ugly, but I think we can make it prettier. Do we use that case anywhere? You will really need to understand the plumbing to make this understandable.
Gregg Kellogg: @propagates +1
Benjamin Young: we're really going to need a primer.
Dave Longley: @propagate : true|[terms] seems ok
Benjamin Young: the more we can reduce that cognitive pain… we need something other than reading the spec to explain how this works.
Rob Sanderson: there seems to be consensus around @propagate ?
Proposed resolution: Create a new keyword, @propagate, for type scoped contexts which takes either a boolean (false is default) or an array of terms, which when present means that all or the listed terms propagate the context listed as the value of the keyword (Rob Sanderson)
Dave Longley: @propagate "propagates" the type-scoped context as a property-scoped context for all listed terms
Gregg Kellogg: we could consider context as an array, and the first item would be @propagate true. This is getting hacky… we're pulling on a thread and we can't stop pulling
… I'm less in favor of this than making it a property of the context itself.
… if it can't work except this way…
… I think this changes the default…
… and if you want the next one to be false…
… how do you inherit the default again?
… these questions are why I'm not happy with these.
Rob Sanderson: This could be solved with metadata on the context, but we've deferred that conversation
Gregg Kellogg: how problematic is it to just refer to it in the context?
Rob Sanderson: it means that we can't include 1.0 contexts, which is not great.
Gregg Kellogg: you can still refer to them…
Rob Sanderson: for type-scoped contexts, if you want to refer to a 1.0 context and type-scope it in, you'd need to rebuild those contexts when @propagate is a property of the context, instead of the referring context
Ivan Herman: Red flag: we were wondering about feature freeze, and we are discussing something here that is not thought through yet, and it's a long discussion, and it's practically June
… I am worried here. Protected took two months, and we're approaching the same place.
Rob Sanderson: the issue is that Verifiable Credentials have assumed one way, and the spec works the other way, so there needs to be a decision one way or the other
… hopefully a solution that works for both.
… we can stick with the spec
Gregg Kellogg: we can do type scope as committed, and without dealing with propagation, or we can remove the type-scoped property…
Rob Sanderson: but that chooses one use case over the other
… we need to deal with the competing use cases
… or revert back to the previous spec
Dave Longley: it doesn't make the previous use case impossible, just verbose.
… the other way around was literally impossible
Rob Sanderson: consider schema.org, you'd need to enumerate all terms in schema on each property. It's possible, but implausible.
… a property on the 1.1 context with propagation, and define a 1.1 context, and @propagates : true
David Newbury: does this mean that the writer of this context decides whether it propagates up or down?
… wouldn't that mean the annotations group would need to define two different versions of that context?
Rob Sanderson: yes. that is indeed the case
… which also seems… not ideal
Gregg Kellogg: I think the way to handle this is to say @propagate changes the default for subsequent properties
… we could include contexts judiciously…
Rob Sanderson: the ugly version of a list where there are processing flags and contexts within the context definition
… documentable, but not pretty
… and order dependent
David Newbury: do we have a sense of which of these inheritance models is more common?
… at this point it feels like we've built in the ability to turn this on or off
… or is that not correct?
Rob Sanderson: I don't think that we know
… currently, all of the inheritance models propagate. In 1.0, everything does so.
… that implies that propagation is more common, but people coming from object-oriented might think otherwise
Pierre-Antoine Champin: I'm not convinced by this, but… I don't think this has been considered.
… another keyword for non-propagating contexts?
… remove the flag, make it cleaner
Rob Sanderson: that does seem cleaner
Ruben Taelman: I like the idea, but that might make contexts even more complicated, as we'd now have two ways to find a context
… it is feasible, but complicated
Pierre-Antoine Champin: Just to be clear, I share that concern.
… two keywords for contexts is ugly
Dave Longley: it could be a keyword on the type definition instead
David Newbury: … and I wanted to point out that considering rob's example, having @context always propagate, and a separate keyword for dlongley's proposal
Gregg Kellogg: the other thing, considering contexts with metadata, where we had metadata, and that could solve this
… then we could set some of these properties…
Rob Sanderson: two routes: new keyword, context reference metadata
Benjamin Young: 1.0 propagates now, so the default is propagate true. Then what we need is the way to prevent that, and to say that this is exclusive
Rob Sanderson: I would be fine with that
Ivan Herman: here is the issue where this was discussed: #108 with a syntax possibility at: #108 (comment)
… there's a syntax proposal there
Benjamin Young: I see it differently, type-scoped contexts didn't exist in 1.0 and are a new concept
… and scoping "type-scoped contexts" to types makes perfect sense.
Ivan Herman: nobody seemed happy at the time with metadata… if this is the only one we define, it allows others… I would not propose integrity now
Dave Longley: +1 to providing a future hook
Proposed resolution: Un-defer #108 with propagation as the use case (Rob Sanderson)
Rob Sanderson: +1
David Newbury: +1
Gregg Kellogg: +1
Tim Cole: +1
Dave Longley: +1
Ruben Taelman: +1
Harold Solbrig: +1
Ivan Herman: +1
Adam Soroka: +1
Pierre-Antoine Champin: +1
Benjamin Young: +1 (with concerns about scope creep)
David I. Lehn: +1
Resolution #2: Un-defer #108 with propagation as the use case
Rob Sanderson: we should then look at 108 over the week and come up with a proposal for contexts
Gregg Kellogg: it might be good if this were done through more detailed proposals in advance
Rob Sanderson: so, everyone who's not on a trip, please contribute to the issue
… and it is the top of the hour
gkellogg commented 5 years ago

@Descends Sorry, you were likely tagged because of an @decends in the meeting minutes, which should have been escaped. It is a possibility to use as a keyword, which happens to be the same as your user name.

Descends commented 5 years ago

No problem, just avoid any other emails please :)


gkellogg commented 5 years ago

API updated to fix this in https://github.com/w3c/json-ld-api/pull/89.

gkellogg commented 5 years ago

#195 was reviewed by @pchampin and @gkellogg; w3c/json-ld-api#89 by @dlongley and @gkellogg. Closing.

iherman commented 5 years ago

This issue was discussed in a meeting.

azaroth42 commented 5 years ago

Proposal:

Allow the value of @context to be a dictionary that includes exactly two (defined) member properties, @src and @propagates.

Allow a new keyword @propagates within a context root node and within a class definition.

When @propagates is encountered at the root node of a context document, then all classes that are defined within the context are treated as if they had the keyword assigned to the supplied value. [in the same way as @protected works]

When @propagates is encountered within a class definition, and it is set to true, then this counteracts the rule described in 4.1.7 as

A context scoped on @type is only in effect for the node object on which the type is used; the previous in-scope contexts are placed back into effect when traversing into another node object.

And instead means that when that class is encountered in a type scoped environment, the current context still propagates, as it would have if @context were set in the instance data.

Context Examples:

{
  "Annotation": {
    "@id": "wa:Annotation",
    "@context": {
      "@src": "http://www.w3.org/ns/anno.jsonld",
      "@propagates": true
    },
    "label": {"@id": "rdfs:label", "@container": ["@language", "@set"]}
  }
}

The Annotation context should be imported in a scoped way within instances of Annotations. The resources referenced in the JSON tree from that annotation should continue to inherit the definitions of the context, instead of the changes being scoped solely to the Annotation instance. This functionality allows 1.1 contexts importing 1.0 contexts to require that the propagation model of 1.0 is respected.

{
  "Annotation": {
    "@id": "wa:Annotation",
    "@propagates": true
  }
}

If this class is encountered as part of a type scoped context, then the definitions continue to propagate to the resources referenced in the JSON tree below it. This allows 1.1 contexts to continue to use the 1.0 propagation model, as if the @context reference were inline within the instance data, rather than as imported within the context definition. Defining it per class allows some classes to behave in 1.1 propagation mode and some in 1.0 propagation mode at the same time.

iherman commented 5 years ago

@azaroth42, thanks

Two things:

gkellogg commented 5 years ago

Proposal:

Allow the value of @context to be a dictionary that includes exactly two (defined) member properties, @src and @propagates.

It will also need to include "@version": 1.1 to not be misinterpreted by a 1.0 processor.

  • The value of @src is a string that is the URI of an external context to be processed, as if it were encountered as a bare string as the value of @context.
  • The value of @propagates is a boolean. If set to true, then all of the classes in the referenced context should be considered as if they had this flag set on them.

Allow a new keyword @propagates within a context root node and within a class definition.

"Class definition"? Do you mean as the embedded context in a term used as a value of @type?

When @propagates is encountered at the root node of a context document, then all classes that are defined within the context are treated as if they had the keyword assigned to the supplied value. [in the same way as @protected works]

So, it's not recursive? It seems we would need to go into a state to specifically check for this. Also, that seems like it's attaching behavior for @src via @propagates, which would seem to me to change the behavior of the context when exiting a node definition, versus uplifting term definitions (and other context things) into the context that includes the reference to @src.

When @propagates is encountered within a class definition, and it is set to true, then this counteracts the rule described in 4.1.7 as

A context scoped on @type is only in effect for the node object on which the type is used; the previous in-scope contexts are placed back into effect when traversing into another node object.

And instead means that when that class is encountered in a type scoped environment, the current context still propagates, as it would have if @context were set in the instance data.

+1, but it probably also has a converse meaning of set to false in a context (scoped, or otherwise), to be consistent.

...

gkellogg commented 5 years ago

I think @azaroth42's suggestion might be a bit narrow, and we might want to consider the following:

  1. If @src appears within a context object, the referencing context must contain @version: 1.1.
  2. The value of @src must be a string interpreted as a URL.
  3. The behavior of @src is treated as if the referenced context were merged with the referencing context, with all term definitions from the referencing/including context taking precedence over those in the referenced context.
  4. The presence of @propagates overrides the default propagation of the context outside of the containing node object. By default, propagates is true for type-scoped contexts, and false otherwise.
  5. The specific type-scoped context rules for propagation are updated to be based on the propagates property of the specific context.

This separates the notion of @src and @propagates, and creates a consistent rule for how to merge @src into a referencing context (potentially allowing for recursive @src in remote contexts, although this is a consequence of the implementation rather than a specific objective).

dlongley commented 5 years ago

@gkellogg,

By default, propagates is true for type-scoped contexts, and false otherwise.

Did you mean the reverse of this?

gkellogg commented 5 years ago

@gkellogg,

By default, propagates is true for type-scoped contexts, and false otherwise.

Did you mean the reverse of this?

Yes, indeed.

gkellogg commented 5 years ago

Also, as I said in the meeting, I think that @src is inconsistent with our keyword naming, and would prefer @source.

dlongley commented 5 years ago

If this is going to be true:

The behavior of @src is treated as if the referenced context were merged with the referencing context, with all term definitions from the referencing/including context taking precedence over those in the referenced context.

Then it seems like @import does make sense as the keyword name.

gkellogg commented 5 years ago

Perhaps, but it depends on which has the least impact on algorithms, I think. Doing the strict @source-@propagate (along with @version) seems like a special case which will require a totally separate branch in the algorithm, while @source/@import seems like potentially a 1-line change.

I'm implementing now, and will have more to say later.

pchampin commented 5 years ago

Sorry, but I find all this quite complicated... Here are two alternate proposals:

Proposal A

Proposal B

Same as proposal A, but remove the exception about type scoped contexts.

I know this would make things harder for VC, but it makes things easier to implement and to explain...

dlongley commented 5 years ago

-1 to Proposal B that would cause JSON-LD 1.1's new features to not compose by default and be unexpected for the first-order constituency of JSON developers/users.

I think Proposal A is what @gkellogg is experimenting with in his implementation (or is close to it).

Note that there are additional fixes we needed to apply to type-scoped context processing to make them behave as expected and round-trip properly. There are real differences in how they are expected to function compared to other contexts, which is fine: they are a good and very useful feature that gives us better alignment with idiomatic JSON. But we shouldn't forget those differences exist, and we must account for them in order to make type-scoped contexts behave as expected. Those differences are baked into how people already think about JSON, so our processing rules must reflect that.

For example, when @type is used within a type-scoped node object, its values are compacted according to the previous context, not according to the type-scoped context.

For example, consider the case where a type-scoped context is cleared:

{
  "@context": {
    "@version": 1.1,
    "collection": "ex:collection",
    "MyType": {
      "@id": "ex:MyType",
      "@context": [null, {
        "foo": "ex:foo"
      }]
    }
  },
  "collection": [{
    "@id": "ex:some_id",
    "@type": "MyType",
    "foo": "bar"
  }]
}

Under this scenario, "MyType" would, quite unexpectedly, lose its meaning and not round trip if type-scoped contexts weren't given special treatment.

iherman commented 5 years ago

This issue was discussed in a meeting.

gkellogg commented 5 years ago

One issue I'm running into is the treatment of @protected when combined with @source. One use case would certainly be to source a context and cause all of its term definitions to be protected, while also allowing other term definitions in the wrapping context to override those terms; but that results in an error, since such redefinitions are only allowed when a term is being defined from a property (using override protected). We could enable this option if the context includes @source, but that could inadvertently allow terms that were defined in previous protected contexts to be overridden. There's really no easy way to limit this to just those terms which were defined in the sourced context.

This may be just an "oh, well ...", or perhaps we need to restrict the enclosing context from defining any term definitions, which was @azaroth42's original proposal. But I could see, for example, using schema.org, protecting the term definitions, but changing something like schema:identifier to be "identifier": {"@id": "http://schema.org/identifier", "@type": "@id"} rather than the default, which is missing @type.

Ideally, this would allow something like the following:

{
  "@context": {
    "@version": 1.1,
    "@protected": true,
    "@source": "https://schema.org/",
    "identifier": {"@id": "https://schema.org/identifier", "@type": "@id"}
  }
}
dlongley commented 5 years ago

@gkellogg,

One use case would certainly be to source a context and cause all of its term definitions to be protected, but also allowing other term definitions in the wrapping context to override those terms results in an error, since such redefinitions are only allowed when a term is being defined from a property (not using "override protected").

My view of how @source/@import should work is that the terms are not defined until the wrapping context is processed. This means that any term definition that is expressed in the wrapping context wipes out any term definition from @source/@import before it is defined, avoiding any term processing issues like the above at all. I'm thinking of @source/@import plus a wrapping context working more like the object spread operator in JavaScript or its Object.assign method.

dlongley commented 5 years ago

So, to process @source/@import: first, fetch its URL value as a document via a document loader; then parse it to get an unprocessed local context (really a Map). Then merge every entry in the wrapping context into that Map, replacing entries as needed. Finally, run context processing on the result.
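The merge step described above could be sketched as follows. This is purely illustrative (the function name and shape are assumptions for this sketch, not anything from the spec or an implementation), and it assumes the @source/@import value has already been dereferenced and parsed into a plain object:

```javascript
// Illustrative sketch: merge a wrapping local context over an
// already-dereferenced imported context, before any term processing.
// Entries in the wrapping context replace same-keyed entries from the
// imported one, like Object.assign or the object spread operator.
function mergeImportedContext(imported, wrapping) {
  const merged = Object.assign({}, imported, wrapping);
  // The directive itself should not survive into the merged context.
  delete merged['@source'];
  delete merged['@import'];
  return merged;
}
```

Context processing then runs once, on the merged map, so a definition in the wrapping context simply replaces an imported (even protected) one before it is ever defined, rather than colliding with it.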

gkellogg commented 5 years ago

That would make it work, but it seems like a big change to the processing algorithms. Right now, it's all about processing a local context on top of an active context. Deferring processing is complicated by the potential shape of a referenced context: an array, more remote contexts, etc. They also need to be based on the active context.

We could restrict the referenced context to be in the form of a Map, but that’s a slippery slope.

Another way would be to pass something to the algorithm to tag all terms created from the sourced/imported context so we could detect that they can be overridden within the local context containing the source/import. Whatever we do, there’s a fair impact on the context processing algorithm.

dlongley commented 5 years ago

@gkellogg,

That would make it work, but seems like a big change to processing algorithms. Right now, it’s all about processing a local context on top of an active context.

The way I think about it is that context processing itself doesn't change much other than adding an additional step that handles @source/@import first, to "construct" the local context before it is processed. To me, this is not unlike how we must first use a document loader to retrieve a local context that is referenced via a URL. So it operates at a different layer than context processing "proper". Before you can process a local context, you:

  1. Dereference it if it's referred to by a URL.
  2. Dereference its @source/@import if present and merge the wrapping context into it.

So context processing itself would be "deferred" as you state next.

Deferring processing is challenged by the potential shape of a referenced context, array, more remote contexts, etc? They also need to be based on the active context.

We could restrict the referenced context to be in the form of a Map, but that’s a slippery slope.

If the value of @context in the retrieved document is an array, we could apply the wrapping context to the last context in that array. I think there would be very limited use in trying to do any more than that.
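Applying the wrapping context to the last element of a dereferenced array could look something like this sketch (the function name and behavior are assumptions for illustration, not spec text; it assumes the final element is a map, since a string there would itself need further dereferencing):

```javascript
// Illustrative: if the dereferenced @context is an array, fold the
// wrapping context into its final element; otherwise merge directly.
// (Removal of the @source/@import entry is omitted here for brevity.)
function foldIntoDereferenced(dereferenced, wrapping) {
  if (Array.isArray(dereferenced)) {
    const last = dereferenced[dereferenced.length - 1];
    return [
      ...dereferenced.slice(0, -1),
      Object.assign({}, last, wrapping)
    ];
  }
  return Object.assign({}, dereferenced, wrapping);
}
```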

Another way would be to pass something to the algorithm to tag all terms created from the sourced/imported context so we could detect that they can be overridden within the local context containing the source/import. Whatever we do, there’s a fair impact on the context processing algorithm.

I think deferring processing could be less messy, similar to document loading, and, if it works, could also potentially match the language we use to describe how the feature works. When you @import a context, it's like editing it inline to create a new local context (per whatever changes you make in the wrapping context) before it gets processed.

dlongley commented 5 years ago

Also, I think deferred processing better matches what @azaroth42 and others would like to do. They want to avoid having to copy and paste an entire context and make a few edits to it so it can be processed with those edits. @import would give them a feature to do it -- and it would work, internally, precisely as if they had done it in the more tedious way.

gkellogg commented 5 years ago

Deferring this way does change the semantics of processing, consider the following:

Remote Context:

{
  "@context": {
    "@vocab": "http://remote.example.com/",
    "foo": {"@type": "@id"}
  }
}

Local context:

{
  "@context": {
    "@version": 1.1,
    "@source": "Remote",
    "@vocab": "http://local.example.com/"
  }
}

"foo" was would have been "http://remote.example.com/foo" if processing the remote context, but is "http://local.example.com/foo" if processed is deferred and the map resulting from processing @source is used to fold in the outer-most local context. There are a number of similar things that would affect the semantics.

Furthermore, if the remote context itself includes a URL (e.g. {"@context": ["ReallyRemote", {..}]}), you need separate logic to look for remote context overflow, and if you don't process the contexts in order, each context could be interpreted differently versus the deferred mechanism.

If we do that, we should probably caveat, if not mandate, that the remote context must be a simple map-like local context structure, and caution that the scope of @vocab/@language/@protected, along with term definitions that are used in other term definitions, could have a different result. Mandating that such contexts not result in such confusion would require a number of new tests to cover each combination.

azaroth42 commented 5 years ago

If there is such a restriction to only allow direct mapping contexts, it would invalidate many contexts for inclusion in this way... rather defeating much of the point.

foo is clearly meant to be http://remote.example.com/foo, rather than whatever the local @vocab is set to.

So I agree with @gkellogg on the deferred processing vs regular processing.

I also (as one might expect) am :+1: to Proposal B. This is an expert feature, not something that most people will use in their daily json-ld lives. If the context writing is slightly harder, that's a relatively small price to pay.

gkellogg commented 5 years ago

@azaroth42 I think you need to clarify your support of Proposal B. That would say that type-scoped contexts propagate by default, which would certainly be a big problem for Verifiable Claims and quite arguably not what people expect from type scoping.

What's in the PR works fairly well, I think, and is essentially Proposal A (although not the third statement: "a scoped context with @propagate set to true will only be active in its scope node". I think this was intended to be when @propagate is set to false).

azaroth42 commented 5 years ago

Yes - I'm not going to lie down in the road for it, but I think that the argument that object-oriented developers would expect it is weak ... they would also expect inheritance and a closed world, neither of which we have. It's also not what anyone used to writing JSON-LD would expect from 1.0, which is going to be the majority of context authors as opposed to users of the resulting data.

Again, whichever way works such that we can fulfill the use cases is fine by me. If that's A ... great. If that's B ... great.

dlongley commented 5 years ago

@gkellogg,

"foo" was would have been "http://remote.example.com/foo" if processing the remote context, but is "http://local.example.com/foo" if processed is deferred and the map resulting from processing @source is used to fold in the outer-most local context. There are a number of similar things that would affect the semantics.

This is actually exactly what I would expect given the feature. This is the only way to "inline" edit and reuse an existing context. If you wanted @vocab to take effect after context processing, we already have a method for that and you'd do this instead:

{
  "@context": [{
    "@version": 1.1,
    "@source": "Remote"
  }, {
    "@vocab": "http://local.example.com/"
  }]
}

Adding @import provides a new feature (inline selective editing of existing contexts) that didn't previously exist. This approach seems to be a useful feature, one that would solve the original requirements, and it would be more easily understood via general principles vs. a feature that is "just for" @propagate, etc.

If we do that we should probably caveat, if not mandate, that the remote context must be a simple map-like local context structure...

While I'd be ok with that restriction, I do think it would be interesting to explore how challenging it would be to "carry through" a flag that would apply the wrapping context to the "last dereferenced context" in any series of context arrays that might be dereferenced -- to establish the final local context prior to processing. That approach would seem to match the goal of the feature.

gkellogg commented 5 years ago

Okay, that argument makes sense. I can update the PR to do the merge as you suggest, which solves the protected problem. I do believe we should restrict the shape of the referenced context to be a Map/Dictionary, as opposed to an array, or string. This covers pretty much every real-world use case and avoids unnecessary complication.

gkellogg commented 5 years ago

I've updated w3c/json-ld-api#112 with what I think we want for behavior, with the value of @source being a string that references a remote JSON-LD file with an @context whose value is an object, which must not itself contain @source. This is reverse-merged into the referencing context, which allows things in the sourced context to be "edited" by the referencing context (including term definitions, @vocab, @protected, and so forth).

It would be straightforward to undo the implied lack of propagation of type-scoped contexts, but I think we should separate that and consider it on a call.

Please give it a look and 👍 or 👎. Based on that, I can further describe it in the syntax document with a separate PR.