zserge / jsmn

Jsmn is the world's fastest JSON parser/tokenizer. This is the official repo, replacing the old one at Bitbucket.
MIT License

How do I know if a token is a member name (key) or a value? #163

Open alevesely opened 5 years ago

alevesely commented 5 years ago

The colon token (':') seems to exist precisely for that purpose, so a key can be flagged while parsing. Working it out afterwards requires careful counting of tokens (as in jsondump). Storing is_key needs only one bit carved out of the token type field (the patch below reserves a full byte for it):

--- jsmn.h  (revision 163)
+++ jsmn.h  (working copy)
@@ -67,7 +67,9 @@
  * end     end position in JSON data string
  */
 typedef struct {
-  jsmntype_t type;
+  int type:8; // byte boundary
+  int is_key:8;
+  int not_used:16; // 32 bit boundary
   int start;
   int end;
   int size;
@@ -112,6 +114,7 @@
   tok = &tokens[parser->toknext++];
   tok->start = tok->end = -1;
   tok->size = 0;
+  tok->is_key = 0;
 #ifdef JSMN_PARENT_LINKS
   tok->parent = -1;
 #endif
@@ -274,7 +277,7 @@

   for (; parser->pos < len && js[parser->pos] != '\0'; parser->pos++) {
     char c;
-    jsmntype_t type;
+    int type; /* jsmntype_t results in -Wsign-compare warning */

     c = js[parser->pos];
     switch (c) {
@@ -374,8 +377,17 @@
     case ' ':
       break;
     case ':':
+    {
+#ifdef JSMN_STRICT
+      /* In strict mode member names are strings */
+      jsmntok_t *key = parser->toknext >= 1? &tokens[parser->toknext - 1]: NULL;
+      if (key == NULL || key->type != JSMN_STRING)
+        return JSMN_ERROR_INVAL;
+      key->is_key = 1;
+#endif
       parser->toksuper = parser->toknext - 1;
       break;
+    }
     case ',':
       if (tokens != NULL && parser->toksuper != -1 &&
           tokens[parser->toksuper].type != JSMN_ARRAY &&

The third hunk (int instead of jsmntype_t) is not related to this issue. It just avoids a compiler warning, which already appeared before I changed the definition of jsmntok_t. See this question on Stack Overflow.
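
Since a single bit is enough, the flag could also be declared as a 1-bit unsigned field. The following is only a sketch under assumptions, not part of the patch: tok_type_t, tok_t and tok_init are hypothetical stand-ins for jsmn's own types, and unsigned int is used because C leaves it implementation-defined whether a plain int bit-field is signed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for jsmntype_t; the values are illustrative. */
typedef enum {
  TOK_UNDEFINED = 0,
  TOK_OBJECT = 1 << 0,
  TOK_ARRAY = 1 << 1,
  TOK_STRING = 1 << 2,
  TOK_PRIMITIVE = 1 << 3
} tok_type_t;

/* Hypothetical token layout: not jsmntok_t and not the patch above. */
typedef struct {
  unsigned int type : 8;      /* holds a tok_type_t value */
  unsigned int is_key : 1;    /* 1 if the token is an object member name */
  unsigned int reserved : 23; /* fills the bit-fields up to 32 bits */
  int start;
  int end;
  int size;
} tok_t;

/* Reset a token, clearing is_key as the second hunk of the patch does. */
static void tok_init(tok_t *tok, tok_type_t type) {
  assert(tok != NULL);
  tok->type = (unsigned int)type;
  tok->is_key = 0;
  tok->reserved = 0;
  tok->start = tok->end = -1;
  tok->size = 0;
}
```

How the compiler packs the bit-fields is implementation-defined, so the sketch makes no claim about sizeof(tok_t).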

After the change, existing tests work fine. In addition, I tried this:

--- example/jsondump.c  (revision 163)
+++ example/jsondump.c  (working copy)
@@ -45,6 +45,7 @@
         printf("  ");
       }
       key = t + 1 + j;
+      if (key->is_key == 0) abort();
       j += dump(js, key, count - j, indent + 1);
       if (key->size > 0) {
         printf(": ");

With this change, jsondump aborts unless it is compiled with JSMN_STRICT; with JSMN_STRICT it works. I should also check that no token is marked is_key when it isn't a key. Oh, well...
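
The missing check in both directions could be sketched as a walk over the token array, counting children the way jsondump does. This is a minimal sketch under assumptions: tok_t, the TOK_* constants and check_keys are hypothetical stand-ins (not jsmn's API), the array is assumed to come from a strict-mode parse, and a member name is assumed to carry its value as its single child (size 1).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for jsmn's token types and jsmntok_t. */
enum { TOK_OBJECT = 1, TOK_ARRAY = 2, TOK_STRING = 4, TOK_PRIMITIVE = 8 };

typedef struct {
  unsigned int type : 8;
  unsigned int is_key : 1;
  int start;
  int end;
  int size; /* number of direct children */
} tok_t;

/*
 * Walk the subtree rooted at t; at most count tokens remain in the array.
 * key_pos is nonzero when t sits where a member name is expected.
 * Returns the number of tokens in the subtree, or -1 on a mismatch.
 */
static int check_keys(const tok_t *t, int count, int key_pos) {
  int i, j, n;

  assert(t != NULL);
  if (count <= 0)
    return -1; /* token array ended early */
  if ((t->is_key != 0) != (key_pos != 0))
    return -1; /* flag disagrees with the token's position */
  if (t->type == TOK_STRING || t->type == TOK_PRIMITIVE)
    return 1; /* leaf; a name's value is visited by the caller */

  j = 1; /* skip the container token itself */
  for (i = 0; i < t->size; i++) {
    if (t->type == TOK_OBJECT) {
      const tok_t *key = t + j;
      n = check_keys(key, count - j, 1);
      if (n < 0 || key->type != TOK_STRING)
        return -1;
      j += n;
      if (key->size == 0)
        continue; /* name without a value */
    }
    n = check_keys(t + j, count - j, 0);
    if (n < 0)
      return -1;
    j += n;
  }
  return j;
}
```

Called on the root token as check_keys(tokens, ntokens, 0), it should return ntokens when every flag matches its position and -1 as soon as a key is unflagged or a value is flagged.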

I found an old issue, #23. The change proposed there would not be backward compatible, and the code it mentions is not available. The intent, however, is similar.

alevesely commented 5 years ago

Oops, the fourth hunk above is obviously wrong. It should have been:

@@ -374,8 +377,20 @@
     case ' ':
       break;
     case ':':
+    {
+#ifdef JSMN_STRICT
+      /* In strict mode member names are strings */
+      if (tokens)
+      {
+        jsmntok_t *key = parser->toknext >= 1? &tokens[parser->toknext - 1]: NULL;
+        if (key == NULL || key->type != JSMN_STRING)
+          return JSMN_ERROR_INVAL;
+        key->is_key = 1;
+      }
+#endif
       parser->toksuper = parser->toknext - 1;
       break;
+    }
     case ',':
       if (tokens != NULL && parser->toksuper != -1 &&
           tokens[parser->toksuper].type != JSMN_ARRAY &&
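
Read in isolation, the logic of the corrected hunk amounts to the helper below. It is a sketch under assumptions, not code from jsmn: tok_t, TOK_STRING, ERR_INVAL, mark_key and mark_key_demo are hypothetical names standing in for jsmntok_t, JSMN_STRING, JSMN_ERROR_INVAL and the inline code in the ':' case. The NULL guard matters because jsmn_parse may be called without a token array to count tokens only, in which case there is no token to flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for JSMN_STRING, JSMN_ERROR_INVAL and jsmntok_t. */
enum { TOK_STRING = 4, ERR_INVAL = -2 };

typedef struct {
  unsigned int type : 8;
  unsigned int is_key : 1;
  int start;
  int end;
  int size;
} tok_t;

/*
 * Flag the most recently allocated token as a member name.
 * toknext is the index of the next free slot, as in the parser state.
 * Returns 0 on success or ERR_INVAL if no string token precedes the colon.
 */
static int mark_key(tok_t *tokens, unsigned int toknext) {
  tok_t *key;

  if (tokens == NULL)
    return 0; /* nothing is stored, so there is nothing to flag */
  if (toknext < 1)
    return ERR_INVAL; /* colon before any token */
  key = &tokens[toknext - 1];
  if (key->type != TOK_STRING)
    return ERR_INVAL; /* strict mode: names must be strings */
  key->is_key = 1;
  return 0;
}

/* Usage example: exercises each outcome once. */
int mark_key_demo(void) {
  tok_t toks[2] = {{TOK_STRING, 0, 0, 0, 0}, {8, 0, 0, 0, 0}};

  assert(mark_key(NULL, 1) == 0);         /* counting pass: no-op */
  assert(mark_key(toks, 1) == 0);         /* string before ':' */
  assert(toks[0].is_key == 1);
  assert(mark_key(toks, 2) == ERR_INVAL); /* non-string before ':' */
  assert(mark_key(toks, 0) == ERR_INVAL); /* no token at all */
  return 0;
}
```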