zserge / jsmn

Jsmn is the world's fastest JSON parser/tokenizer. This is the official repo, replacing the old one at Bitbucket.
MIT License

Line numbers in tokens #146

Open kjhermans opened 5 years ago

kjhermans commented 5 years ago

May I suggest we have line numbers in tokens? It would help greatly to debug faulty JSON files.

pt300 commented 5 years ago

Tokens already store their position in the input string. I see no point in implementing what you suggested, as it would needlessly increase JSMN's memory use.
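
To illustrate the point about stored positions, here is a minimal sketch of how a pair of token offsets can be used to slice the input. `tok_span` and `tok_copy` are hypothetical names made up for this example; `tok_span` only mirrors the `start` and `end` offsets that a jsmn token carries and is not jsmn's own type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the offsets a jsmn token carries
 * (jsmntok_t has `start` and `end` members with the same meaning). */
typedef struct {
  int start; /* offset of the first byte of the token */
  int end;   /* offset one past the last byte of the token */
} tok_span;

/* Copies the token text js[start..end) into out and NUL-terminates it.
 * Returns the number of bytes copied, or -1 if the span is invalid or
 * does not fit into out (out is left untouched in that case). */
int tok_copy(const char *js, tok_span t, char *out, size_t outsz)
{
  assert(js != NULL && out != NULL);
  if (t.start < 0 || t.end < t.start) return -1;
  size_t len = (size_t)(t.end - t.start);
  if (len >= outsz) return -1;
  memcpy(out, js + t.start, len);
  out[len] = '\0';
  return (int)len;
}
```

For the input `{"key": "value"}`, a span of 9..14 would select `value`. The same offsets are all that is needed to work out a line number later, without storing one per token.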

kjhermans commented 5 years ago

What if tokens could be popped individually (for example, through a callback), or if there were a way to get a more useful error message on failure (the main structure could store the current line)?

pt300 commented 5 years ago

Seems like what you want is a fully featured JSON parsing library. If that's the case, I'd suggest you look for alternatives. JSMN is quite minimalistic and has a few problems.

kjhermans commented 5 years ago

No, not really. What I really, really appreciate about jsmn is the terseness of the library, which gives me a good indication of the way it will hold up in a security evaluation (# of LoC). My comment about the #ifdef in the typedef comes from this perspective (it makes the library less secure, because the typedef may inadvertently be interpreted differently by whoever compiles the library and whoever uses the compiled library). My comment about line numbering was more me passing on an end-user complaint (the end user likes to know where the parsing goes wrong), and I don't care that much about it myself. Besides, because I know the position in the JSON string, I can recover the line number afterwards anyway, by counting it myself.

pt300 commented 5 years ago

I see what you mean. But rather than the line number, I'd be more interested in the specific point where the error happened. That's because I could have JSON data which is not neatly formatted and has a lot happening on one line.
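
As a rough sketch of what reporting "the specific point" might look like for JSON packed onto one line: the helper below quotes a small window of text around a byte offset. `error_window` is a hypothetical name, and the sketch assumes the caller already has a byte offset for the failure; how that offset is obtained from jsmn is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies a window of js around pos into out: up to `radius` bytes before
 * pos and up to `radius` bytes starting at pos. Newlines, carriage returns
 * and tabs are replaced by spaces so the excerpt stays on one line.
 * Returns the index inside out that corresponds to pos, or -1 if pos is
 * past the end of js or the window does not fit into out. */
int error_window(const char *js, size_t pos, size_t radius,
                 char *out, size_t outsz)
{
  assert(js != NULL && out != NULL);
  size_t len = strlen(js);
  if (pos > len) return -1;
  size_t begin = pos > radius ? pos - radius : 0;
  size_t end = (len - pos > radius) ? pos + radius : len;
  if (end - begin >= outsz) return -1;
  for (size_t i = begin; i < end; i++) {
    char c = js[i];
    out[i - begin] = (c == '\n' || c == '\r' || c == '\t') ? ' ' : c;
  }
  out[end - begin] = '\0';
  return (int)(pos - begin);
}
```

The returned index can be used to print a caret under the excerpt, which works the same whether the input spans many lines or just one.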

kjhermans commented 5 years ago

Something like this?

/**
 * Calculates the line and column of [pos] inside [string] by counting
 * the newlines that precede it.
 *
 * \param string  Zero-terminated string, at least [pos] bytes long.
 * \param pos     Byte offset inside the string.
 * \param vector  On successful return, contains the 1-based line number [0]
 *                and the 1-based column on that line [1].
 *
 * \returns       Zero on success (pos is inside string), or non-zero on error.
 */
int strxypos
  (const char* string, unsigned pos, unsigned vector[ 2 ])
{
  unsigned i = 0;

  vector[ 0 ] = 1;
  vector[ 1 ] = 1;
  while (i < pos) {
    switch (string[ i++ ]) {
    case '\n':
      ++(vector[ 0 ]);
      vector[ 1 ] = 0;
      break;
    case 0:
      return ~0;
    }
    ++(vector[ 1 ]);
  }
  return 0;
}

dominickpastore commented 4 years ago

Line numbers could be useful, but many users of jsmn are on memory constrained systems. If such features are added, they should probably be wrapped in #ifdef JSMN_LINENO (or similar), like we have for parent links.
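
A minimal sketch of that opt-in idea, under stated assumptions: `demo_tok` and `demo_tok_line` are hypothetical names and this is not jsmn's actual token layout. `JSMN_LINENO` is the macro name floated above, not an existing jsmn option. When the macro is not defined, the token stays small and the line is recomputed from the stored offset on demand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token layout; not jsmn's actual jsmntok_t. */
typedef struct {
  int start;         /* offset of the first byte of the token */
  int end;           /* offset one past the last byte of the token */
#ifdef JSMN_LINENO   /* assumed macro name, taken from the suggestion above */
  int line;          /* 1-based line of `start`; costs one int per token */
#endif
} demo_tok;

/* Returns the 1-based line on which the token starts. */
int demo_tok_line(const char *js, const demo_tok *t)
{
  assert(js != NULL && t != NULL);
#ifdef JSMN_LINENO
  (void)js;          /* the line was recorded at parse time */
  return t->line;
#else
  int line = 1;      /* recount newlines before the token on demand */
  for (int i = 0; i < t->start && js[i] != '\0'; i++) {
    if (js[i] == '\n') line++;
  }
  return line;
#endif
}
```

The trade-off is the one discussed in the thread: the compiled-in field costs memory for every token, while the fallback costs a scan of the input only when a line number is actually needed.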