digitalbazaar / jsonld.js

A JSON-LD Processor and API implementation in JavaScript
https://json-ld.org/

Control order of graph nodes and children nodes fromRdf #395

Open dnllowe opened 4 years ago

dnllowe commented 4 years ago

I was previously working with a team that needed to preserve the order of parent and child nodes when created from RDF.

For example, given the quads below, we'd want to control whether subjects 1, 2, and 3 get sorted, and whether their children (properties 1, 2, and 3) are sorted. As of now, fromRdf always sorts the parent nodes and their children (Subject 1 with Properties 1, 2, 3; Subject 2 with Properties 1, 2, 3; etc.), losing the original order:

<http://example.com/Subject3/Property3> <http://example.com/value> "3" <http://example.com/Subject3> .
<http://example.com/Subject3/Property1> <http://example.com/value> "1" <http://example.com/Subject3> .
<http://example.com/Subject3/Property2> <http://example.com/value> "2" <http://example.com/Subject3> .

<http://example.com/Subject1/Property3> <http://example.com/value> "3" <http://example.com/Subject1> .
<http://example.com/Subject1/Property1> <http://example.com/value> "1" <http://example.com/Subject1> .
<http://example.com/Subject1/Property2> <http://example.com/value> "2" <http://example.com/Subject1> .

<http://example.com/Subject2/Property3> <http://example.com/value> "3" <http://example.com/Subject2> .
<http://example.com/Subject2/Property1> <http://example.com/value> "1" <http://example.com/Subject2> .
<http://example.com/Subject2/Property2> <http://example.com/value> "2" <http://example.com/Subject2> .
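To make the difference concrete, here is a minimal sketch in plain JavaScript that groups a subset of the quads above by graph name, once in order of first appearance and once sorted. It does not call jsonld.js; `parseQuads` and `groupByGraph` are hypothetical helpers written only for this illustration, and the parser handles only the simple quad shape used in this issue.

```javascript
// Hypothetical sample input: a subset of the quads from this issue, in original order.
const nquads = `
<http://example.com/Subject3/Property3> <http://example.com/value> "3" <http://example.com/Subject3> .
<http://example.com/Subject3/Property1> <http://example.com/value> "1" <http://example.com/Subject3> .
<http://example.com/Subject1/Property3> <http://example.com/value> "3" <http://example.com/Subject1> .
<http://example.com/Subject1/Property1> <http://example.com/value> "1" <http://example.com/Subject1> .
`;

// Minimal parser for the quad shape used above (IRI IRI "literal" IRI .).
// This is NOT a general N-Quads parser.
function parseQuads(text) {
  const pattern = /^<([^>]*)>\s+<([^>]*)>\s+"([^"]*)"\s+<([^>]*)>\s*\.$/;
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const match = pattern.exec(line);
      if (!match) throw new Error(`Unsupported quad: ${line}`);
      const [, subject, predicate, value, graph] = match;
      return { subject, predicate, value, graph };
    });
}

// Groups subjects by graph name. A Map iterates its keys in insertion order,
// so with sort=false the order of first appearance is preserved.
function groupByGraph(quads, { sort = false } = {}) {
  const graphs = new Map();
  for (const { subject, graph } of quads) {
    if (!graphs.has(graph)) graphs.set(graph, []);
    graphs.get(graph).push(subject);
  }
  let entries = [...graphs.entries()];
  if (sort) {
    // Sort graph names, then sort the subjects inside each graph.
    entries = entries
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
      .map(([graph, subjects]) => [graph, [...subjects].sort()]);
  }
  return entries.map(([graph, subjects]) => ({ graph, subjects }));
}

const quads = parseQuads(nquads);
const preserved = groupByGraph(quads);
const sorted = groupByGraph(quads, { sort: true });

console.log(preserved.map((g) => g.graph)); // Subject3 first, as in the input
console.log(sorted.map((g) => g.graph));    // Subject1 first, input order lost
```

The requested behavior corresponds to the `sort: false` path: the output follows the order in which graphs and subjects first appear in the input.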
goofballLogic commented 4 years ago

@gkellogg can you confirm if this approach is correct in theory? I'm holding off on approving https://github.com/linked-data-dotnet/json-ld.net/pull/41 b/c I'm not confident that this honours the sorting semantics of the spec.

gkellogg commented 4 years ago

ToRdf does its sorting as part of the expansion algorithm, which is actually optional. Any order to the resulting triples is an artifact of the algorithm execution. List nodes, in particular, may not come out in an expected order.

Generally, I’m not a fan of imposing such an order, as graphs are inherently unordered (thus the need for the List ladder).

Implementations have no normative requirement to honor such ordering, and there is an option to control it, but no tests to verify such output ordering.

It may be that the best practice is to perform ordering after the algorithm has completed, but an application-specific option to perform some specific ordering should be fine. I can’t speak for Digital Bazaar or whether they would accept such a PR to jsonld.js or PyLD.
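As a rough illustration of ordering after the algorithm has completed, the sketch below reorders an array of node objects by an application-supplied list of `@id` values. It assumes the output is an array of objects that each carry an `@id`; `reorderNodes` is a hypothetical helper for this example, not an option or function of jsonld.js or PyLD.

```javascript
// Hypothetical post-processing step: reorder nodes to match an
// application-supplied list of @id values (e.g. the original query order).
function reorderNodes(nodes, idOrder) {
  const rank = new Map(idOrder.map((id, index) => [id, index]));
  // Nodes whose @id is not listed are placed after all listed nodes.
  const position = (node) =>
    rank.has(node['@id']) ? rank.get(node['@id']) : Number.MAX_SAFE_INTEGER;
  // Array.prototype.sort is stable in ES2019+, so ties keep their relative order.
  return [...nodes].sort((a, b) => position(a) - position(b));
}

// Assumed sample data: nodes as they might come out sorted by @id.
const sortedNodes = [
  { '@id': 'http://example.com/Subject1' },
  { '@id': 'http://example.com/Subject2' },
  { '@id': 'http://example.com/Subject3' },
];

// Order in which the subjects appeared in the original input.
const originalOrder = [
  'http://example.com/Subject3',
  'http://example.com/Subject1',
  'http://example.com/Subject2',
];

const restored = reorderNodes(sortedNodes, originalOrder);
console.log(restored.map((node) => node['@id']));
```

The trade-off is that the application has to carry the original order alongside the data, which is the extra bookkeeping that preserving order inside the conversion itself would avoid.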

dnllowe commented 4 years ago

@gkellogg Thanks for taking a look and sharing your thoughts. If it helps, for context, the company I was working with was mapping database calls into RDF format, then later using the json-ld.net library to return JSON-LD to either a frontend or directly to our clients. There was a requirement to preserve the order from the original database query throughout its use in the application.

In our case, we saw maintaining the order during FromRDF as the cleanest and most efficient solution, since the order from the original database query was preserved at the previous stages.

gkellogg commented 4 years ago

Sorry if I confused FromRdf and ToRdf, which is usually where ordering issues arise. The FromRdf algorithm, as stated, should be fairly order preserving, although if compacting or framing the result, that order could be changed. The specs don’t require any particular ordering, other than for list elements, so any implementation-specific ordering is perfectly fine from a conformance perspective.
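One way to picture the conformance point is a comparison that ignores the order of ordinary values but respects the order inside `@list`. The sketch below is only an illustration: `sameValues` is a hypothetical helper, and it assumes both sides serialize object keys in the same order, since it compares values by their `JSON.stringify` output.

```javascript
// Hypothetical helper: compares two arrays of expanded JSON-LD values.
// The outer arrays are compared without regard to order. A {"@list": [...]}
// object is serialized as a whole, so the order of its elements still matters.
function sameValues(a, b) {
  if (a.length !== b.length) return false;
  const keysA = a.map((value) => JSON.stringify(value)).sort();
  const keysB = b.map((value) => JSON.stringify(value)).sort();
  return keysA.every((key, index) => key === keysB[index]);
}

// Same plain values in a different order: treated as equivalent.
const plainEqual = sameValues(
  [{ '@value': '1' }, { '@value': '2' }],
  [{ '@value': '2' }, { '@value': '1' }]
);

// Same list elements in a different order: treated as different.
const listEqual = sameValues(
  [{ '@list': [{ '@value': '1' }, { '@value': '2' }] }],
  [{ '@list': [{ '@value': '2' }, { '@value': '1' }] }]
);

console.log(plainEqual, listEqual);
```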

dnllowe commented 4 years ago

> Sorry if I confused FromRdf and ToRdf, which is usually where ordering issues arise. The FromRdf algorithm, as stated, should be fairly order preserving, although if compacting or framing the result, that order could be changed. The specs don’t require any particular ordering, other than for list elements, so any implementation-specific ordering is perfectly fine from a conformance perspective.

Sorry, that might be partially my mistake: I added my test case to the toRdf tests instead of the fromRdf tests (fixed now). I also added a test verifying the default behavior of fromRdf, showing that it produces ordered output.

396