w3c / json-ld-framing

JSON-LD 1.1 Framing Specification
https://w3c.github.io/json-ld-framing/

Framing and Arrays #111

Closed about-code closed 4 years ago

about-code commented 4 years ago

The Framing Recommendation claims:

Framing is used to shape the data in a JSON-LD document, using an example frame document which is used to both match the flattened data and show an example of how the resulting data should be shaped.

This sounds to me as if JSON-LD-Parsing + JSON-LD-Framing may be meant to help with mapping back-and-forth between data models carrying semantically equivalent data but having chosen different syntactic models for it. At least this is a common problem I often see in practical data integration and data exchange scenarios. So I tried to test this hypothesis with a pretty simple example of putting the Library input data from the JSON-LD v1.1 Framing Recommendation into a different shape.

I began with this frame:

Frame

{
  "@context": {
    "@vocab": "http://example.org/"
  },
  "@type": "Library",
  "contains": {}
}

It produces

Framed Output

{
  "@context": {
    "@vocab": "http://example.org/"
  },
  "@id": "http://example.org/library",
  "@type": "Library",
  "contains": {
    "@id": "http://example.org/library/the-republic",
    "@type": "Book",
    "contains": {
      "@id": "http://example.org/library/the-republic#introduction",
      "@type": "Chapter",
      "description": "An introductory chapter on The Republic.",
      "title": "The Introduction"
    },
    "creator": "Plato",
    "title": "The Republic"
  },
  "location": "Athens"
}

To me, the output looks like just another way to render the tree that is encoded in the input @graph adjacency list. My goal, however, was to map the input model onto a slightly different one. So I looked for a frame that maps the input model onto a model which uses a books array and a chapters array rather than scalar contains properties:

Desired Framed output

{
  "@context": {
    "@vocab": "http://example.org/"
  },
  "@id": "http://example.org/library",
  "@type": "Library",
  "books": [{
    "@id": "http://example.org/library/the-republic",
    "@type": "Book",
    "chapters": [{
      "@id": "http://example.org/library/the-republic#introduction",
      "@type": "Chapter",
      "description": "An introductory chapter on The Republic.",
      "title": "The Introduction"
    }],
    "creator": "Plato",
    "title": "The Republic"
  }],
  "location": "Athens"
}

However, this is the closest frame I could get to after hours of trial and error:

{
  "@context": {
    "@vocab": "http://example.org/",
    "books": {
      "@id": "contains",
      "@context": {
        "chapters": {
          "@id": "contains", 
          "@container": "@set"
        }
      }
    }
  },
  "@type": "Library",
  "contains": {}
}

It produces this result:

{
  "@context": {
    "@vocab": "http://example.org/"
  },
  "@id": "http://example.org/library",
  "@type": "Library",
  "books": {
    "@id": "http://example.org/library/the-republic",
    "@type": "Book",
    "chapters": [{
      "@id": "http://example.org/library/the-republic#introduction",
      "@type": "Chapter",
      "description": "An introductory chapter on The Republic.",
      "title": "The Introduction"
    }],
    "creator": "Plato",
    "title": "The Republic"
  },
  "location": "Athens"
}

which is obviously still missing the books array. Logically, one would conclude that what is missing is a "@container": "@set" in the term definition that maps "contains" onto "books". So let's add it:

That Frame should do it:

{
  "@context": {
    "@vocab": "http://example.org/",
    "books": {
      "@id": "contains",
      "@container": "@set",
      "@context": {
        "chapters": {
          "@id": "contains", 
          "@container": "@set"
        }
      }
    }
  },
  "@type": "Library",
  "contains": {}
}

That failed: for some reason the output now changes to

{
  "@context": {
    "@vocab": "http://example.org/"
  },
  "@id": "http://example.org/library",
  "@type": "Library",
  "books": [{
    "@id": "http://example.org/library/the-republic",
    "@type": "Book",
    "books": [{
      "@id": "http://example.org/library/the-republic#introduction",
      "@type": "Chapter",
      "description": "An introductory chapter on The Republic.",
      "title": "The Introduction"
    }],
    "creator": "Plato",
    "title": "The Republic"
  }],
  "location": "Athens"
}

wiping out chapters. That's where I gave up. I know this is not a support forum, but after so much time spent on a dozen different frames I am not even sure anymore the recommendation provides any solution to that problem. If so, what's the solution I don't see? Or is this just a nasty bug in json-ld.js used on the JSON-LD playground?

Thanks in advance for any response.

dlongley commented 4 years ago

Here's the frame I think you're looking for:

{
  "@context": {
    "@vocab": "http://example.org/",
    "Library": {
      "@context": {
        "books": "contains"
      }
    },
    "Book": {
      "@context": {
        "chapters": "contains"
      }
    }
  },
  "@type": "Library",
  "books": {
    "@type": "Book",
    "chapters": {
      "@type": "Chapter"
    }
  }
}

Playground link: https://tinyurl.com/y5rbuv53

Edited to match your above input which is slightly different from the "Library" example on the playground.

dlongley commented 4 years ago

This sounds to me as if JSON-LD-Parsing + JSON-LD-Framing may be meant to help with mapping back-and-forth between data models carrying semantically equivalent data but having chosen different syntactic models for it.

Yes, that's a use case for it.

That's where I gave up. I know this is not a support forum, but after so much time spent on a dozen different frames I am not even sure anymore the recommendation provides any solution to that problem. If so, what's the solution I don't see? Or is this just a nasty bug in json-ld.js used on the JSON-LD playground?

In this situation it looks like you want to base your terms on type-scoped contexts, not property-scoped contexts. Your desired output calls for "contains" to be represented using the term "books" when it's in a library (type "Library"), but you want "contains" to be using the term "chapters" when it appears on a type "Book". Therefore, I used type-scoped contexts to solve this case.
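As a rough illustration of the type-scoped idea, here is a minimal Python sketch. It is not the JSON-LD compaction algorithm: TYPE_SCOPED and term_for are hypothetical names made up for this example, and the model only looks at the type of the node that carries the property. Type-scoped contexts do not propagate to nested nodes by default, which is why consulting only the node's own type is a reasonable simplification here.

```python
# Simplified, hypothetical model of type-scoped term lookup.
# This is an illustration only, not the real JSON-LD compaction algorithm.
VOCAB = "http://example.org/"

# Terms that are only defined on nodes of a given type (assumed layout).
TYPE_SCOPED = {
    "Library": {"books": VOCAB + "contains"},
    "Book": {"chapters": VOCAB + "contains"},
}


def term_for(node_type, iri):
    """Pick the term used for `iri` on a node of type `node_type`."""
    for term, mapped_iri in TYPE_SCOPED.get(node_type, {}).items():
        if mapped_iri == iri:
            return term
    # Fall back to @vocab-style shortening when no scoped term matches.
    if iri.startswith(VOCAB):
        return iri[len(VOCAB):]
    return iri


print(term_for("Library", VOCAB + "contains"))
print(term_for("Book", VOCAB + "contains"))
```

The same IRI gets a different term depending on the type of the node it appears on, which is the behaviour the desired output asks for.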

I hope this helps.

dlongley commented 4 years ago

A little more information for you. When using this context definition:

{
  "@context": {
    "@vocab": "http://example.org/",
    "books": {
      "@id": "contains",
      "@container": "@set",
      "@context": {
        "chapters": {
          "@id": "contains", 
          "@container": "@set"
        }
      }
    }
  },
  "@type": "Library",
  "contains": {}
}

The term "books" is first defined -- globally. Then the term "chapters" is defined -- via a property-scoped context. So "chapters" will only be defined within the scope of any value of the "books" property. However, the term "books" is itself still defined for any value of "books" -- because "books" is defined globally.
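The scoping described above can be sketched in a few lines of Python. This is a deliberately simplified model, not what a JSON-LD processor actually does internally; the names GLOBAL_TERMS, BOOKS_SCOPED_TERMS, active_terms and candidates are assumptions made up for the illustration.

```python
# Simplified, hypothetical model of which terms are "in scope".
CONTAINS = "http://example.org/contains"

# Defined at the top level of the context, so visible everywhere.
GLOBAL_TERMS = {"books": CONTAINS}

# Added by the property-scoped context on "books",
# so visible only inside values of "books".
BOOKS_SCOPED_TERMS = {"chapters": CONTAINS}


def active_terms(inside_books_value):
    """Return the term definitions visible at the current position."""
    terms = dict(GLOBAL_TERMS)
    if inside_books_value:
        terms.update(BOOKS_SCOPED_TERMS)
    return terms


def candidates(iri, inside_books_value):
    """List every visible term that maps to `iri`."""
    visible = active_terms(inside_books_value)
    return sorted(term for term, mapped in visible.items() if mapped == iri)


print(candidates(CONTAINS, inside_books_value=False))
print(candidates(CONTAINS, inside_books_value=True))
```

Inside a value of "books" there are two candidates for the same IRI, because the scoped context adds "chapters" without removing the global "books".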

This means that the JSON-LD processor, when looking at the property "http://example.org/contains" on a value of "books", must choose between two possible terms: "books" and "chapters". The current term selection rules do not use scoping as a preference when there are multiple choices (which is not intuitive) -- they instead follow the original term selection rules from JSON-LD 1.0, from before "scoped contexts" were introduced. Those rules choose the shortest term and, among terms of equal length, the lexicographically (alphabetically) first one, which is, in this case, "books". That's why "chapters" doesn't show up here.
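The tie-break itself is easy to sketch. The Python below covers only that final step (shortest first, then lexicographic order); a real processor also takes container, type and language mappings into account before it gets this far, and select_term is a hypothetical helper name, not part of any JSON-LD library.

```python
def select_term(candidate_terms):
    """Pick the shortest term; break ties in lexicographic order.

    Scope plays no role here, which mirrors the behaviour described above.
    """
    return min(candidate_terms, key=lambda term: (len(term), term))


# Both terms map to http://example.org/contains inside a value of "books".
print(select_term(["chapters", "books"]))
```

Since "books" has five characters and "chapters" has eight, "books" is chosen no matter which of the two was defined in the closer scope.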

I can't remember if the JSON-LD Working Group considered introducing "closest scope" into the term preference algorithm or if it was decided that it was either too complicated or would be a breaking change with 1.0. Either way, it didn't make the cut.

If you wanted to keep the context you have with a minimal change, you could undefine "books" inside its own property-scoped context; setting it to null clears the "books" term definition for values of "books":

{
  "@context": {
    "@vocab": "http://example.org/",
    "books": {
      "@id": "contains",
      "@container": "@set",
      "@context": {
        "books": null,
        "chapters": {
          "@id": "contains", 
          "@container": "@set"
        }
      }
    }
  },
  "@type": "Library",
  "contains": {}
}
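To see why the null entry helps, here is a small Python sketch of the effect. It is a simplification made for this thread: apply_scoped_context is a hypothetical helper, and a real processor tracks a null term definition differently, but the outcome for term selection is the same, namely that "books" is no longer a candidate inside values of "books".

```python
CONTAINS = "http://example.org/contains"


def apply_scoped_context(active, scoped):
    """Return a copy of `active` with `scoped` applied; None undefines a term."""
    result = dict(active)
    for term, iri in scoped.items():
        if iri is None:
            result.pop(term, None)  # "books": null removes the inherited term
        else:
            result[term] = iri
    return result


global_terms = {"books": CONTAINS}
scoped_on_books = {"books": None, "chapters": CONTAINS}

inside_books = apply_scoped_context(global_terms, scoped_on_books)
candidates = [term for term, iri in inside_books.items() if iri == CONTAINS]
print(candidates)  # → ['chapters']
```

With only one candidate left, there is no tie to break, so "chapters" is used for the nested "contains".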

However, it did seem like you were going for definitions that were based on the types of objects more than on how they happened to be related. For example, I imagine that if a library had mentioned that it did not contain a book, perhaps via a term "missing"/"checkedOut" (or whatever), you'd still want any book there to express its contents using the term "chapters". So it seemed most appropriate to express the difference in how the terms appeared in the data using type-scoped contexts.

about-code commented 4 years ago

The current term selection rules do not use scoping as a preference when there are multiple choices (which is not intuitive) -- it instead uses the original term selection rules from JSON-LD 1.0, before "scoped contexts" were introduced. The term selection rules will choose the shortest and lexicographically (alphabetically first) option

@dlongley Thank you very much for digging into the example and for the enlightening responses. Really helpful and surprising. Indeed, I got so used to scoped contexts in source documents that, intuitively, I could have sworn they could be used for term selection in frames in the same way.

However, it did seem like you were going for definitions that were based on the types of objects more than on how they happened to be related.

Well anticipated. Regarding your initial example: I had to reintroduce scoped contexts in order to add "@container": "@set" and get the desired arrays. So the frame I am happy with, which produces the desired output, looks as follows:

{
  "@context": {
    "@vocab": "http://example.org/",
    "Library": {
      "@context": {
        "books": {
          "@id": "contains",
          "@container": "@set"
        }
      }
    },
    "Book": {
      "@context": {
        "chapters": {
          "@id": "contains",
          "@container": "@set"
        }
      }
    }
  },
  "@type": "Library",
  "books": {
    "@type": "Book",
    "chapters": {
      "@type": "Chapter"
    }
  }
}
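For anyone wondering why "@container": "@set" is what makes the difference between a scalar and an array in the output, here is a minimal Python sketch of that one rule. It assumes the default behaviour where single-element arrays are compacted to a bare value; shape_values is a hypothetical name, and this is an illustration rather than the actual compaction algorithm.

```python
def shape_values(values, container=None):
    """Decide how a list of property values is rendered in compacted output."""
    if container == "@set":
        # A set container always keeps the array form, even for one value.
        return list(values)
    if len(values) == 1:
        # Without a container, a single value is unwrapped.
        return values[0]
    return list(values)


book = {"@id": "http://example.org/library/the-republic"}

print(shape_values([book]))
print(shape_values([book], container="@set"))
```

Without the container, a library holding a single book gets a scalar "books" value; with it, the value stays an array regardless of how many books there are.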