parsica-php / parsica

Parsica - PHP Parser Combinators - The easiest way to build robust parsers.
https://parsica-php.github.io/
MIT License

Unique keys in JSON-object #9

Open HermanPeeren opened 4 years ago

HermanPeeren commented 4 years ago

The current JSON parser in Parsica behaves the same way as json_decode for non-unique keys in a JSON object: the value of the last occurrence of a key overwrites the earlier ones:

$JSON = '{"key1":"value1","key2":"value2a","key2":"value2b","key3":"value3"}';
$object = json_decode($JSON);
var_dump($object);

gives:

object(stdClass)[1]
  public 'key1' => string 'value1' (length=6)
  public 'key2' => string 'value2b' (length=7)
  public 'key3' => string 'value3' (length=6)

PHP uses RFC 7159 for its JSON definition, which states in section 4: "The names within an object SHOULD be unique" ("names" == "keys"). If you want to use the parser to check the validity of the JSON input, it should give a warning or an error when a key is repeated. PHP's json_decode doesn't do that; it just overwrites the value.

In the original JSON definition, RFC 4627, object keys don't need to be unique (although that doesn't make much sense to me). Checking for unique keys makes the JSON definition context-sensitive. See this posting on Stack Overflow.

In Parsica the key-value pairs are written sequentially into an associative array, which is then cast to an object.
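
The overwrite follows from how PHP arrays work: assigning to an existing key replaces its value. The following is a minimal sketch in plain PHP, not Parsica's actual code; the function names `pairsToObject` and `pairsToObjectStrict` are hypothetical and only illustrate where a duplicate check could sit if the pairs are collected this way.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: folds [key, value] pairs into an object,
 * so a repeated key silently keeps the last value.
 */
function pairsToObject(array $pairs): object
{
    $members = [];
    foreach ($pairs as [$key, $value]) {
        $members[$key] = $value; // a later assignment overwrites an earlier one
    }
    return (object) $members;
}

/**
 * Hypothetical strict variant: rejects a repeated key instead of overwriting.
 */
function pairsToObjectStrict(array $pairs): object
{
    $members = [];
    foreach ($pairs as [$key, $value]) {
        if (array_key_exists($key, $members)) {
            throw new InvalidArgumentException("Duplicate key in JSON object: $key");
        }
        $members[$key] = $value;
    }
    return (object) $members;
}

// The pairs from the example at the top of this issue.
$pairs = [['key1', 'value1'], ['key2', 'value2a'], ['key2', 'value2b'], ['key3', 'value3']];

echo pairsToObject($pairs)->key2, PHP_EOL; // value2b

try {
    pairsToObjectStrict($pairs);
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL; // Duplicate key in JSON object: key2
}
```

Whether a duplicate should raise an error, emit a warning, or be configurable is a design choice this sketch leaves open.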

Parsing context-free languages is generally simpler than parsing context-sensitive languages. Context-sensitivity cannot be expressed in EBNF / ABNF grammars. But context-sensitivity is a necessary property for a Turing-complete language.

mathiasverraes commented 4 years ago

Context sensitivity will be supported when we introduce parser state in Parsica. The current plan is to do something similar to Haskell's State Monad, but no work has been done in that regard yet.
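
To illustrate the general idea of state-passing, here is a rough sketch in plain PHP. It is not Parsica's API and not the planned design: the closure shape `(input, state) -> [value, remaining input, new state]` and the names `uniqueKey` and `many` are assumptions made up for this example.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch, not Parsica's API: a "parser" here is a closure
// (string $input, array $seenKeys) -> [value, remainingInput, newSeenKeys],
// or null on failure. The extra argument plays the role of the state.

/** Parses a double-quoted key (no escapes, for brevity) and records it in the state. */
function uniqueKey(): Closure
{
    return function (string $input, array $seenKeys): ?array {
        if (!preg_match('/^"([^"\\\\]*)"/', $input, $m)) {
            return null; // input does not start with a quoted key
        }
        if (isset($seenKeys[$m[1]])) {
            return null; // context-sensitive check: this key was seen before
        }
        $seenKeys[$m[1]] = true;
        return [$m[1], substr($input, strlen($m[0])), $seenKeys];
    };
}

/** Runs a parser repeatedly, threading the state from one run into the next. */
function many(Closure $parser): Closure
{
    return function (string $input, array $state) use ($parser): array {
        $values = [];
        while (($result = $parser($input, $state)) !== null) {
            [$value, $input, $state] = $result;
            $values[] = $value;
        }
        return [$values, $input, $state];
    };
}

[$keys, $rest] = many(uniqueKey())('"a""b""a""c"', []);
// $keys is ['a', 'b']; parsing stops at the repeated "a", so $rest is '"a""c"'
```

The point of the sketch is only that the set of keys seen so far travels alongside the input, which is the kind of information a context-free grammar cannot carry.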

As for the json parser: