lloyd / yajl

A fast streaming JSON parsing library in C.
http://lloyd.github.com/yajl
ISC License
2.15k stars 435 forks source link

Add yajl_reset_stack() to recover from parsing failure #161

Open ppisar opened 9 years ago

ppisar commented 9 years ago

libbons project currently bundles yajl because it needs to support recovering from yajl parsing failures by resetting the byte stack. See [https://jira.mongodb.org/browse/CDRIVER-597].

Followin patch adds yajl_reset_stack() public function which could satisfy libbson need:

From d42168311963a6827b0c9d547d8efdf0e7fbdcae Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Petr=20P=C3=ADsa=C5=99?= <ppisar@redhat.com>
Date: Fri, 24 Apr 2015 14:51:24 +0200
Subject: [PATCH] Add yajl_reset_stack()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

yajl_reset_stack() resets parser stack and initilize it to
yajl_state_start. This is needed by libbson library to recover from
parsing failure without need to destroy the handle completely.

See <https://jira.mongodb.org/browse/CDRIVER-597>.

Signed-off-by: Petr Písař <ppisar@redhat.com>

diff --git a/src/api/yajl_parse.h b/src/api/yajl_parse.h
index 1c25a60..8ba903a 100644
--- a/src/api/yajl_parse.h
+++ b/src/api/yajl_parse.h
@@ -168,6 +168,9 @@ extern "C" {
     /** free a parser handle */
     YAJL_API void yajl_free(yajl_handle handle);

+    /** Flush parser stack to initial state. All other settings are preserved. */
+    YAJL_API void yajl_reset_stack(yajl_handle handle);
+
     /** Parse some json!
      *  \param hand - a handle to the json parser allocated with yajl_alloc
      *  \param jsonText - a pointer to the UTF8 json text to be parsed
diff --git a/src/yajl.c b/src/yajl.c
index d477893..1cb7c51 100644
--- a/src/yajl.c
+++ b/src/yajl.c
@@ -114,6 +114,13 @@ yajl_free(yajl_handle handle)
     YA_FREE(&(handle->alloc), handle);
 }

+void
+yajl_reset_stack(yajl_handle handle)
+{
+    handle->stateStack.used = 0;
+    yajl_bs_push(handle->stateStack, yajl_state_start);
+}
+
 yajl_status
 yajl_parse(yajl_handle hand, const unsigned char * jsonText,
            size_t jsonTextLen)
-- 
2.1.0
ajdavis commented 9 years ago

Hope we can get this accepted and released this quarter, it'd be a big help for the MongoDB libbson project if we can use the standard YAJL API instead of hacking into YAJL's internals.

cicku commented 9 years ago

Seems @lloyd is not available.

ajdavis commented 8 years ago

Hi @lloyd, I could really use some response to this request. I am willing to help in any way you like, whatever you require to prevent you from doing too much work.

lloyd commented 8 years ago

hrm. this patch doesn't seem like it would properly re-initialize lexer state?

lloyd commented 8 years ago

Also, to be clear, the motivation for this patch is performance? Reallocation of the yajl parser for distinct documents is slowing you down?

ajdavis commented 8 years ago

Yes, we're trying to avoid reallocation when we go from one document to the next, parsing something like:

{"a": 1}{"a": 2}
lloyd commented 8 years ago

ok, give me a minute to hand you a patch on a branch to review.

ajdavis commented 8 years ago

Great! Yeah, I'm looking just a little more deeply at this, and it seems some of the lexer's state should be cleared. This is enough for libbson's use case:

   yh->stateStack.used = 0;
   yajl_bs_push (yh->stateStack, yajl_state_start);

... but it's not fully correct for other users.

lloyd commented 8 years ago

@ajdavis how's this look? 646b8b82ce5441db3d11b98a1049e1fcb50fe776

ajdavis commented 8 years ago

Thanks, let me test this with libbson

lloyd commented 8 years ago

:+1:

ajdavis commented 8 years ago

So far so good. One thing I've wondered about is whether we properly report the error location if we encounter a syntax error after the first document, so I'll add a test for that and report back again. =)

ajdavis commented 8 years ago

There's some work to do regarding properly showing the location of a parse error (the (right here) ------^ message) but I'm pretty sure the problem's in libbson, not YAJL.

So LGTM!

lloyd commented 8 years ago

ok, I'll merge down and roll a new release within a day.

lamont-granquist commented 8 years ago

hey @lloyd would you consider adding other maintainers?

ppisar commented 8 years ago

Thank you very much for implementing the yajl_reset(). I'm looking forward for the new release.

lloyd commented 8 years ago

I would absolutely consider additional maintainers. Who's interested?

--lloyd

On Sep 25, 2015, at 10:21 AM, Lamont Granquist notifications@github.com wrote:

hey @lloyd would you consider adding other maintainers?

— Reply to this email directly or view it on GitHub.

ajdavis commented 8 years ago

I'd like to try, if you'd provide me with some guidance when I need it.

On Thursday, October 1, 2015, Lloyd Hilaiel notifications@github.com wrote:

I would absolutely consider additional maintainers. Who's interested?

--lloyd

On Sep 25, 2015, at 10:21 AM, Lamont Granquist <notifications@github.com javascript:_e(%7B%7D,'cvml','notifications@github.com');> wrote:

hey @lloyd would you consider adding other maintainers?

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/lloyd/yajl/issues/161#issuecomment-144904557.

lamont-granquist commented 8 years ago

I'm definitely interested. I'm the primary author of http://github.com/chef/ffi-yajl and would like to get some of the windows build stuff into better shape and stop maintaining internal patches at the very least...

remicollet commented 8 years ago

ok, I'll merge down and roll a new release within a day.

@lloyd any news about this ?

bjori commented 8 years ago

@lloyd ping?

yanatan16 commented 8 years ago

:+1: I also implemented this in #177 but am deferring to this PR to be merged.

lafeuil commented 8 years ago

@lloyd Are there any news to merge the patch in the master branch and roll a new release ? Without this release, it is impossible to integrate some libraries in distributions.

Is there a problem with maintenance of yajl ? It seems that there are some developers ( @lamont-granquist, @ppisar ) that would be happy to maintain it.

nmiller12 commented 6 years ago

It would be great if yajl was maintained so that it can be added in distributions like RHEL. (there is currently no yajl-devel package)