lloyd / yajl

A fast streaming JSON parsing library in C.
http://lloyd.github.com/yajl
ISC License
2.16k stars 437 forks source link

Why does yajl_parser need to call abort? #142

Open jeroen opened 10 years ago

jeroen commented 10 years ago

I wrote R bindings for yajl and all works great, however it turns out yajl violates R packaging policy:

* checking compiled code ... WARNING
File ‘/Users/jeroen/workspace/jsonlite.Rcheck/jsonlite/libs/jsonlite.so’:
  Found ‘_abort’, possibly from ‘abort’ (C)
    Object: ‘yajl/yajl_parser.o’

Compiled code should not call entry points which might terminate R nor
write to stdout/stderr instead of to the console.

Upon inspection of yajl_parser.c there is indeed an abort() call at the very end.

My question: why is this call there? Under which condition is yajl supposed to exit the process? The code even has a return statement after the abort() call. Can I just comment out the abort() line if I don't want to exit, or will this have side effects?

cjgdev commented 10 years ago

That abort() call is there because the only way the parser can fall into that part of the code is if something went very wrong and the value of yajl_bs_current(hand->stateStack) does not yield a value that is in the switch statement of the state machine.

There are two reasons this might happen, either the developer has added new states and forgotten to handle them in the state machine (in which case a default statement would help capture and return a meaningful error code), or memory corruption causes a valid state to be overwritten.

The reason for the abort() and the return will be to quiet the compiler warning about a function that doesn't return a value even though the signature indicates that a value must be returned. It's considered good manners to fix such compiler warnings.

If you want to remove the abort, there will not be any side effects as such - that is to say, assuming yajl parses correctly all valid states (which does appear to be true) and there is no memory corruption (also seemingly true, assuming the client code also behaves), then the abort() is not strictly required.

However, if you are considering making a pull request, I would recommend a default case statement to catch all such errors, and maybe create a new return code to indicate a parser error.

jeroen commented 10 years ago

Thank you for the detailed answer. I am just going to comment out the abort() in my own copy of the code for now. It looks like there are many pull requests in the queue ahead of me :)