kgabis / parson

Lightweight JSON library written in C.
MIT License
1.34k stars 327 forks source link

Error message if parsing fails #79

Open benswick opened 7 years ago

benswick commented 7 years ago

Currently, when parsing fails a NULL value is returned and my software can only log a "something went wrong" message. It would be great to be able to log the line number and line contents that caused the failure.

I apologize for not reading the bottom of the readme about creating issues before a pull request. Pull request #78 shows a minimally invasive way to implement this. It returns a JSONError value with an error message in the string field.

realnol commented 7 years ago

you should print log

mbelicki commented 7 years ago

This is actually quite nice solution (extending JSONValue union with error case).

I don't understand however why you are creating an error message in parson. In my opinion the library should try not handle any end user communication. It should return structure with line and column numbers and preferably a type of error. With this information the application which is using the library can form an error message in whatever way it finds suitable.

Please note that there are also some minor style issues in your patch:

benswick commented 7 years ago

@mbelicki I don't understand however why you are creating an error message in parson. In my opinion the library should try not handle any end user communication. It should return structure with line and column numbers and preferably a type of error. With this information the application which is using the library can form an error message in whatever way it finds suitable.

The API for that might look something like this:

typedef struct json_error_t JSON_Error;

enum json_error_type {
    JSONError_Unidentified = -1,
};
typedef int JSON_Error_Type;

JSON_Error * json_value_get_error(const JSON_Value *value);

int    json_error_get_type(const JSON_Error *error);
size_t json_error_get_line_number(const JSON_Error *error);
size_t json_error_get_column(const JSON_Error *error);
/* Offset from the beginning of the file or string */
size_t json_error_get_offset(const JSON_Error *error);

In the future, new json_error_type enumeration values could be added to express the variety of errors that are possible.

I have cleaned up the style inconsistencies in PR #78. I also changed remove_comments() to retain newlines so line numbers remain correct.

kgabis commented 7 years ago

Sorry for taking so long to respond, but I've been quite busy recently. I agree that there should be some information why parsing failed and where, returning NULL provides too little feedback.

Instead of extending existing function a nicer solution would be to add a new one. I really don't want people to use a different API based on a compiler flag.

Example:

enum json_parse_error_t {
    JSON_PARSE_ERROR_NONE = 0,
    JSON_PARSE_ERROR_MISSING_BRACKET,
    ...
};
typedef struct json_parse_error_t {
    enum json_parse_error_t error;
    unsigned int line;
    unsigned int line_offset;
} JSON_Parse_Error;

typedef struct json_parse_option_t {
    int accept_comments;
} JSON_Parse_Options;

JSON_Value* json_parse(const char *string, JSON_Parse_Options parse_options, JSON_Parse_Error *err); /* err is optional and can be NULL */

It's only a proposal though.

The real issue is that the API feels a bit dated and inconsistent in some places. It's been growing gradually over last 5 years and a cleanup would be nice at some point in future (parson 2.0?). I cannot guarantee any timeline, but it's something I keep at the back of my mind.

Anyway, I'd rather not accept your pull request (https://github.com/kgabis/parson/pull/78), because I don't want json_parse_string to behave differently based on a compiler flag and I think a new json_parse function is a better solution. Also, it's going to be necessary to rewrite parts of parsing code to remove comments and parse string in a single pass (it also gets rid of a malloc(sizeof(str_parsed)) that serves as a temp buffer) to get actual offset within the parsed line.

I'm going to keep this issue open because other people might have a similar problem and it needs to be resolved at some point.

benswick commented 7 years ago

I don't mind the closed pull request. It was just a starting point for what this could look like. I should have just referenced the branch on my fork.

I would suggest including the offset of the error from the beginning of the string as well. It makes it easy for applications to directly show the text that caused the failure. With only the line and offset they would have to rescan the string to count lines.

What kind of ABI stability are you going for with parson? Adding an option to the proposed JSON_Parse_Options struct would change its size.

Some option APIs:

JSON_Parse_Options * json_options_init(void);

void json_options_allow_comments(JSON_Parse_Options *options, int allow_comments);

void json_options_free(JSON_Parse_Options *options);


- Make the options global and add functions like `void set_json_option_allow_comments(int allow);` or a function like `void set_json_option(enum JSON_Option_Type option, int value)`. Not multi-threading friendly.
- Require apps to recompile to use the new version. Anyone compiling it as a library would probably dislike this option.
- If all the options are binary flags it could hold an unsigned int with each option corresponding to a bit. I could imagine wanting to set nesting and capacity limits through options, though.

A few more awkward APIs I've seen.
- Pass a string and parse the options out.
- The user could pass the size of the struct (as an argument or the first member of the struct) and a new version of the library would know which version it is by the size. If the library was compiled with a smaller version it could use only what it understands or return an error.

Technically, the proposed JSON_Parse_Error struct has the same issue. But I'm having a hard time coming up with plausible changes to it other than adding the absolute string offset. Maybe using size_t to support really large files.
benswick commented 7 years ago

I've implemented the proposed API with a few adjustments on my parse-errors-v2 branch 880a0589a024eb823a609d3d33958aae9efee266 to see what it would look like.

Changes:

Basically, I added a JSON_Parse_Error* argument to all of the parse functions. Whenever they would return NULL they also set the type of error.

While updating the tests, I noticed some test_suite_3 failures were for different reasons than the comments indicated. These three have "control character" comments after them.

I added tests so each error type has at least one test. I also added tests for the three Parse Validation failures in issue #53 that are fixed by using parse_outer_value(). If a test fails it prints the error type that was returned to make it easier to see if the parsing code was wrong or if the expected error type was wrong.