jmespath-community / jmespath.spec

JMESPath Specification
JSON grammar rules are inconsistent. #105

Closed springcomp closed 1 year ago

springcomp commented 1 year ago

The JMESPath grammar mostly includes rules for parsing valid JSON texts. However, the JSON RFC includes errata that were not included in this spec.

In particular, the two unescaped-char and escaped-char rules are inconsistent:

quoted-string     = quote 1*(unescaped-char / escaped-char) quote
unescaped-char    = %x20-21 / %x23-5B / %x5D-10FFFF
escape            = "\"
quote             = %x22   ; Double quote: '"'
escaped-char      = escape (
                        %x22 /          ; "    quotation mark  U+0022
                        %x5C /          ; \    reverse solidus U+005C
                        %x2F /          ; /    solidus         U+002F
                        %x62 /          ; b    backspace       U+0008
                        %x66 /          ; f    form feed       U+000C
                        %x6E /          ; n    line feed       U+000A
                        %x72 /          ; r    carriage return U+000D
                        %x74 /          ; t    tab             U+0009
                        %x75 4HEXDIG )  ; uXXXX                U+XXXX

The / U+002F SOLIDUS character is present in both of those rules. It should, however, be excluded from the unescaped-char rule.
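To make the overlap concrete, here is a minimal Python sketch that encodes the code point ranges of the two rules quoted above and checks where the solidus falls. The helper names `is_unescaped_char` and `may_follow_escape` are hypothetical and exist only for this illustration; they are not part of any JMESPath implementation.

```python
# Code point ranges copied from the unescaped-char rule quoted above.
UNESCAPED_RANGES = [(0x20, 0x21), (0x23, 0x5B), (0x5D, 0x10FFFF)]

# Code points that may follow the escape character in escaped-char.
# The uXXXX form is represented here by its leading "u" only.
ESCAPABLE = {0x22, 0x5C, 0x2F, 0x62, 0x66, 0x6E, 0x72, 0x74, 0x75}


def is_unescaped_char(ch: str) -> bool:
    """Return True if ch is matched by the unescaped-char rule (hypothetical helper)."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in UNESCAPED_RANGES)


def may_follow_escape(ch: str) -> bool:
    """Return True if ch may directly follow the backslash in escaped-char (hypothetical helper)."""
    return ord(ch) in ESCAPABLE


if __name__ == "__main__":
    # The solidus (U+002F) sits inside the %x23-5B range and is also escapable.
    print(is_unescaped_char("/"), may_follow_escape("/"))
    # The double quote and the backslash are escapable only.
    print(is_unescaped_char('"'), is_unescaped_char("\\"))
```

The double quote and the backslash are the only characters in the escape list that the unescaped-char ranges leave out; the solidus is accepted in both forms.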

This PR fixes this issue.

gibson042 commented 1 year ago

Sorry I missed this PR a couple weeks ago, but that JSON RFC erratum has not been verified and basically cannot be accepted because JSON is synchronized between the IETF and ECMA... U+002F solidus "/" is allowed inside JSON strings both unescaped and preceded by an escaping backslash.
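As a quick check of that behaviour, the following sketch uses Python's standard `json` module (chosen here only as a convenient conforming JSON parser, not something mentioned in this thread) to decode the same string with and without the escaping backslash.

```python
import json

# The same JSON string, once with a bare solidus and once with an escaped one.
plain_text = '"a/b"'
escaped_text = r'"a\/b"'

plain = json.loads(plain_text)
escaped = json.loads(escaped_text)

if __name__ == "__main__":
    # Both forms are accepted and decode to the same three-character string.
    print(plain == escaped == "a/b")
```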

But in JMESPath, the unescaped-char nonterminal is primarily for identifier via quoted-string and shouldn't even relate to JSON text at all.

springcomp commented 1 year ago

@gibson042 I take your word that the JSON erratum has not been verified, and I agree that solidus should be valid both escaped and unescaped. I will revert this change.

However, the quoted-string rule from JMESPath is shared between identifier and json-value. JSON and JMESPath do have some relation at least.

But reverting the PR should be sufficient I reckon.

gibson042 commented 1 year ago

> However, the quoted-string rule from JMESPath is shared between identifier and json-value. JSON and JMESPath do have some relation at least.

Right, and that strikes me as a problem... the keys for JSON object members must be JSON strings, but right now JMESPath inconsistently has both json-value = false / null / true / json-object / json-array / json-number / json-quoted-string (where json-quoted-string introduces a \` escape and disallows unescaped `) and member = quoted-string name-separator json-value (where quoted-string allows unescaped ` and does not have a \` escape), which doesn't make any sense.
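The mismatch can be sketched with two regular expressions in Python. These patterns are approximations built only from the description in this thread (quoted-string as quoted in the issue; json-quoted-string assumed to differ solely by excluding the unescaped backtick U+0060 and adding a backslash-backtick escape). They are not copied from the specification, and the names `QUOTED_STRING` and `JSON_QUOTED_STRING` are hypothetical.

```python
import re

# Shared escape forms from the escaped-char rule quoted in the issue.
ESCAPED_CHAR = r'\\(?:["\\/bfnrt]|u[0-9A-Fa-f]{4})'

# quoted-string: every code point except the double quote and the backslash.
UNESCAPED_CHAR = r'[\x20-\x21\x23-\x5B\x5D-\U0010FFFF]'
QUOTED_STRING = re.compile(rf'"(?:{UNESCAPED_CHAR}|{ESCAPED_CHAR})+"')

# json-quoted-string (assumed shape): additionally excludes the bare backtick
# and accepts a backslash followed by a backtick.
UNESCAPED_LITERAL = r'[\x20-\x21\x23-\x5B\x5D-\x5F\x61-\U0010FFFF]'
ESCAPED_LITERAL = r'\\(?:["\\/bfnrt`]|u[0-9A-Fa-f]{4})'
JSON_QUOTED_STRING = re.compile(rf'"(?:{UNESCAPED_LITERAL}|{ESCAPED_LITERAL})+"')

bare_backtick_key = '"a`b"'      # key containing an unescaped backtick
escaped_backtick_key = '"a\\`b"'  # key containing backslash + backtick

if __name__ == "__main__":
    for key in (bare_backtick_key, escaped_backtick_key):
        print(
            key,
            bool(QUOTED_STRING.fullmatch(key)),
            bool(JSON_QUOTED_STRING.fullmatch(key)),
        )
```

Under these assumptions, a key with a bare backtick is accepted only by the quoted-string pattern, while a key with an escaped backtick is accepted only by the json-quoted-string pattern, so the two rules disagree on what a member key inside a literal may look like.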

springcomp commented 1 year ago

I think you are right. Would that be a simple oversight that could be fixed with the following?

member = json-quoted-string name-separator json-value

gibson042 commented 1 year ago

Yes, I think so.

springcomp commented 1 year ago

125.