Closed WebReflection closed 6 years ago
In principle, I certainly like the idea of having a JSON structure to reference another. Any thoughts on how your proposed scheme could be reasonably encoded as a URI fragment identifier?
Interesting question. May I ask when/where this is needed since the whole JSON is not too friendly as URI fragment?
My best guess is that if it's so relevant, either we use exact same convention is in place now but integers are distinguished by a single dot as suffix, like 1.~prop~2~a~3. could be or I need to better understand use cases and real world examples to better think about it, thanks .
That was 1. ~ prop ~ 2 ~ a ~ 3.
The fragment identifier is used to identify a JSON value in a JSON document referenced by a URI. It's a stated objective of the RFC, equivalent in purpose of (and inspired by) the XML Pointer specification. In a notable example, it's used in JSON Schema in "$ref" JSON references.
edited to reflect latest proposal which is IMO superior
Given the following document
{
"foo": ["bar", "baz"],
"": 0,
"a/b": 1,
"c%d": 2,
"e^f": 3,
"g|h": 4,
"i\\j": 5,
"k\"l": 6,
" ": 7,
"m~n": 8,
"a.b": 9,
"0": 10,
"1": ["biz"],
"'2'": 12
}
We could represent each access through the following notation:
# // the whole document
#.foo ["bar", "baz"]
#.foo.0 "bar"
#. 0
#.a/b 1
#.c%25d 2
#.e%5Ef 3
#.g%7Ch 4
#.i%5Cj 5
#.k%22l 6
#.%20 7
#.m~n 8
#.a%2Eb 9
#.'0' 10
#.'1'.0 "biz"
#.%272%27 12
The accessing semantic reflects common object properties access through a .
dot.
Whenever an index is a string and it looks like a number, it's wrap through '
single quote reflects the explicit semantic intent.
The choice of a sub-delims respects its RFC3986 purpose:
If a reserved character is found in a URI component and no delimiting role is known for that character, then it must be interpreted as representing the data octet corresponding to that character's encoding in US-ASCII.
Since all keys need to be uri-decoded, when special characters such .
and '
are part of the key,
a surely not so common case, their representation must be encoded as %2E
for the dot and %27
for the single quote.
The common Array index accessor is preserved as integer.
The following regular expression would match any array index, including negatives: ^-?[0-9]+$
Integers are preserved in encoding thanks to their type information (an index of an Array is numeric, a key in a hash/object is string)/
In the not so common case a key is represented by a plain number and matches the previous regular expression, it must be wrapped in single quotes so that it'll look like '1'
or even '-1'
.
Most common JSON cases will be easier to read and write.
.foo.bar.0.name
.foo.bar.-1.age
The semantic doesn't need much explanation since it's regular JavaScript notation with the exceptions for indexes but it's easy to reason about the choice.
Same applies for numeric keys
.foo.bar.-1.biz.'123'.role
There is no ambiguity or specific order of encoding/decoding to remember so this proposal is also less error prone for humans.
A human reader/writer needs to remember both %2E
and %27
have special meaning.
However, this is in my opinion easier than remembering what's ~0
, what's ~1
, and in which order these must be encoded or decoded.
Thoughts ?
Instead of opening closing parenthesis, we could consider using just '
single quote.
Whenever an object key is numeric, it's access would be better described by the quote
.foo.bar.0.name
VS .foo.bar.'1'.name
and the amount of characters to remember would be just 2, but not conflicting with each other.
Same benefits, less to remember.
Accordingly, given the following document
{
"foo": ["bar", "baz"],
"": 0,
"0": 1,
"'": 2
}
we can access all properties as such:
# // the whole document
#.foo ["bar", "baz"]
#.foo.0 "bar"
#. 0
#.'0' 1
#.%27 2
I actually prefer this version over the previous one.
FYI I have implemented a 100% code-covered version of the latest proposal: https://github.com/WebReflection/json-pointer
<!doctype html>
<script src="https://unpkg.com/better-json-pointer@0.0.0/min.js"></script>
<script>
var encoded = JSON.Pointer.encode(['a', 0, '1', "'b.'"]);
console.log(encoded);
// ".a.0.'1'.%27b%2E%27"
console.log(JSON.Pointer.decode(encoded));
// ["a", 0, "1", "'b.'"]
</script>
is there any progress on this?
@WebReflection - so this issues list isn't a decision-making body, it's just where we can discuss things about a potential future effort.
Your proposal looks reasonable, but if we wanted to incorporate it into a future format, we'd have to determine whether the benefits outweighed the costs of having Yet Another Way To Do This.
It seems like there's a decent amount of interest in this, but we won't know for sure until we decide to wrap things up and start an effort to define a new spec.
thanks for the update. I just go through my issues from time to time and this was still open, hence my question. I'll keep it open then. Regards
@mnot - JSON patch has been very useful for me but I've been hitting limits with being able to apply generic patches across documents. I started introducing tools like http://jsonpath.com/ to make my patching more scalable. It's ability to filter and do recursive searching can make updating json objects in any document structure much easier.
nothing is happening here, so I remove this ticket from my list of issues.
Writing here as suggested by Mark Nottingham.
The RFC6901 specification looks very JSON-unfriendly, re-inventing arrays through escaped strings, easily error prone while manipulating any path in projects such just-diff.
A different proposal
A JSON pointer can be just an
Array
requiring zero special cases and granting type preservation. Being an Array, a JSON pointer can be used also as generic JavaScript path pointer including the ability to have aSymbol
as part of the key.Examples
Given the following structure:
Instead of having
/biscuits/1/name
as path to point at"Choco Leibniz"
string, the path could be["biscuits", 1, "name"]
.Since a path can be specified to use integers when it comes to
Array
access, it's now never ambiguous to understand the structure.An object with numeric entries will have these represented as strings, an Array will have these represented as integers.
Accordingly, to point at
"Choco Leibniz"
we can now use["biscuits", "5", "name"]
.Edge cases
Empty keys are allowed via
[""]
and at any level as in["", 2, "", "name"]
.Last item
Using the convention that an integer, if negative, means from the
Array
length, it is possible to point at a generic last item via-1
, as example via["biscuits", -1]
.If the length of the array is meant, as appending at the end, being
length
an array property the path could simply be["biscuits", "length"]
.Improvements
As typed array, JSON pointer would be a list/set that can contain either strings or integers.
As a wider solution, a JSON pointer could contain also JavaScript symbols which is a generic object key by all means, but this is out of current JSON Pointer scope/proposal.
Thanks in advance for any sort of outcome.