w3c / json-ld-api

JSON-LD 1.1 Processing Algorithms and API Specification
https://w3c.github.io/json-ld-api/

Meaning of "ordered by" in Node Map Generation step 6.12 #586

Open danpape opened 10 months ago

danpape commented 10 months ago

Recent versions of the "Latest editor's draft" have changed many of the references to "lexicographic [order]" into "Unicode code point order", which is great.

I'm having trouble getting my implementation to pass toRDF test c019, I think because I'm not sure what Node Map Generation step 6.12 means when it states: "Finally, for each key-value pair property-value in element ordered by property perform the following steps:". What does "ordered by property" mean? Should I sort the properties by "Unicode code point order" before executing steps 6.12.x, or should the properties be processed in the order they were originally inserted into the 'element' being processed?

If I sort the properties by "Unicode code point order" then the test fails because my generated blank node names created in step 6.2 are out of order. I just wanted to make sure I understood the meaning of the phrasing "ordered by property" before I simply stop sorting the properties--which does make the test pass, but leaves me worried I'm missing something.

gkellogg commented 10 months ago

There are a number of instances where the algorithm says "for each x and y ordered by x" or something similar. Unless x is a numeric value (such as length), these should generally be considered to use code point order. However, looking at my own code, I can see that it considers each property from element in its natural order, which is generally undefined. This represents a bug in the algorithm, but it's interesting that this hasn't been detected before. I consider this an erratum.
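To illustrate why the iteration order matters here, the following is a minimal sketch, not the Node Map Generation algorithm itself. It assumes a toy labeller (the hypothetical `label_blank_nodes`) that hands out `_:b0`, `_:b1`, ... in the order properties are visited, and made-up `http://example.org/` property IRIs. The only point is that sorted order and insertion order can assign different labels to the same properties.

```python
from itertools import count


def label_blank_nodes(element, sort_keys):
    """Toy stand-in (hypothetical): issue one blank node label per property,
    numbered in the order the properties are visited."""
    counter = count()
    # sorted() on Python str compares by Unicode code point;
    # list(element) keeps the dict's insertion order.
    keys = sorted(element) if sort_keys else list(element)
    return {key: f"_:b{next(counter)}" for key in keys}


# Properties inserted in an order that differs from code point order.
element = {"http://example.org/z": {}, "http://example.org/a": {}}

sorted_labels = label_blank_nodes(element, sort_keys=True)
insertion_labels = label_blank_nodes(element, sort_keys=False)

print(sorted_labels)     # "a" is visited first, so it receives _:b0
print(insertion_labels)  # "z" is visited first, so it receives _:b0
```

Under these assumptions the two runs disagree on which property gets `_:b0`, which is the kind of mismatch that shows up when comparing blank node names against expected test output.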

gkellogg commented 8 months ago

This issue was discussed in a meeting.

Issue w3c/json-ld-api#586
https://github.com/w3c/json-ld-api/issues/586 -> Issue 586 Meaning of "ordered by" in Node Map Generation step 6.12 (by danpape) [spec:substantive] [needs discussion] [ErratumRaised]
David I. Lehn: Do we have tests for this?
Pierre-Antoine Champin: Codepoint order is the same as lexicographical order in UTF-8.
... I wanted to be sure Rust was doing the same thing.
... There may be an issue in UTF-16.
... Internally, Javascript uses UTF-16.
David I. Lehn: I think there was a test for this.
Gregg Kellogg: I think saying that "ordered by" is in code point order is consistent with the rest of the spec, and consistent with current test results.
... That would make the change editorial, not substantive.
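
The UTF-8 versus UTF-16 point raised in the minutes can be checked with a short sketch. It assumes three sample keys chosen for illustration (an ASCII letter, U+FF5E from the Basic Multilingual Plane, and U+1F600 from a supplementary plane) and simulates code-unit ordering by comparing encoded bytes; the helper names are hypothetical. Big-endian UTF-16 is used so that byte order matches 16-bit code unit order.

```python
def utf8_key(s):
    # Comparing UTF-8 bytes preserves Unicode code point order.
    return s.encode("utf-8")


def utf16_key(s):
    # Comparing big-endian UTF-16 bytes is equivalent to comparing
    # 16-bit code units, which is how a UTF-16 based string type
    # would order strings by default.
    return s.encode("utf-16-be")


keys = ["\U0001F600", "\uFF5E", "a"]

by_code_point = sorted(keys)             # Python str compares by code point
by_utf8 = sorted(keys, key=utf8_key)
by_utf16 = sorted(keys, key=utf16_key)

print([hex(ord(k)) for k in by_code_point])
print([hex(ord(k)) for k in by_utf16])
```

With these keys, the UTF-8 ordering matches code point order, while the UTF-16 ordering places U+1F600 before U+FF5E, because its leading surrogate (0xD83D) is smaller than 0xFF5E. An implementation that relies on a UTF-16 code unit comparison would therefore need an explicit code point comparison to satisfy "Unicode code point order" for keys outside the Basic Multilingual Plane.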