Anorov opened this issue 9 years ago
Hi @Anorov and @bungle, thank you so much for reporting this problem. I just took a look.
As for the OOM, I don't think this is a real issue. If I call collectgarbage() before the decoder is invoked, the problem disappears. Memory allocation taking place in C-land does not interfere with LuaJIT; the only big overhead in the C-to-Lua converter is an intermediate array holding the references to the objects being converted. I think this is a reasonable overhead.
Thank you again for reporting the problem. I will fix the defects ASAP.
Great to hear that.
The OOM is probably just my fault. When I ran the test separately it didn't OOM (I have also gotten some OOMs with other LuaJIT bindings, but no OOMs with the bundled cjson).
And yes, my test is just a small test I run to validate my code (it only checks the time spent, not other things like memory usage). It seems that different implementations get quite similar results; there doesn't seem to be one JSON lib "to rule them all" that wins on all occasions. But there are some defects in some libs (or in my LuaJIT binding implementation, or platform-related problems), like this one that I'm trying to figure out: https://github.com/vurtun/json/issues/1, and the fact that encoding (not decoding) in my lua-resty-libcjson is terrible performance-wise (mind you, you will only notice it with really big files like that citylots). My citylots test is not really practical, as you don't usually deal with JSON files this big. In my smaller-file test your lib seemed among the fast ones (probably the fastest, but not by a long margin).
In no way am I saying that there are problematic defects in your code (it seems to fare really well; at least no big issues). My conclusion so far is to use the bundled cjson (Lua cJSON 2.1): it's reliable, has good performance overall, supports both encoding and decoding, and has no OOM problems. If you need stream parsing, use something else. It seems that the Lua part of the code in my libs almost doubles the time compared to what's spent on the C side.
I will continue retesting, and I will report if I find something new.
Thx!
Tonight I got a chance to rewrite the function which parses floating-point values.
Comparing cjson and lua-resty-json on my laptop (x240) with https://raw.githubusercontent.com/zemirco/sf-city-lots-json/master/citylots.json, the performance was 2.34s (cjson) vs 2.49s (lua-resty-json) before the change, and is 2.34s vs 1.7s after.
I will polish the change sometime next week. Floating point is a bit tricky, so I will also add a bunch of unit test cases, of course :-)
Thanks
Sounds like a huge improvement. I will check it on my machine when you get it checked in. Looks great!
Oops, spoke too soon. After I had almost "finished" the work, I realized my implementation is not precise when parsing fp numbers. Neither is cjson's, IIRC. I will check @bungle's implementation, and I will steal his code if it's correct in this regard.
My implementation could incur up to 1/2**53 relative error, which does not seem to be a big deal in most situations, but picky folks could still complain.
My holy grail would be http://www.exploringbinary.com/correct-decimal-to-floating-point-using-big-integers/.
@yangshuxin, the relevant code is this: https://github.com/bungle/libopjson/blob/master/json.c#L161-L240
I haven't tested it, and I don't know if it has the same problems. It is basically the same code as in @vurtun/json, but with some small modifications.
The code I referenced is advised to be compiled with -O3 -fno-gcse -fno-crossjumping (although I have seen better results with -fno-crossjumping removed).
In my cityjson test, lua-resty-opjson is now the fastest one, although the Lua side of it is not really optimized (it uses recursion, for example). I'm looking forward to testing your modifications in lua-resty-json.
@bungle, thank you for pointing me to the code. As far as I can tell, it computes the fraction part, say 0.1234, as 1 * 0.1 + 2 * (0.1 * 0.1) + 3 * (0.1 * 0.1 * 0.1) + 4 * (0.1 * 0.1 * 0.1 * 0.1). I would say this is very imprecise; more imprecise than the variant I implemented that is enabled by -DFP_RELAX=2. That being said, if the application is built with -ffast-math in its CFLAGS, such imprecision is perhaps OK.
FYI:

```c
#include <stdio.h>
#include <string.h>  /* for memcmp */

double __attribute__((noinline))
foo(double factor) {
    return 0.1 * factor + 0.1 * 0.1 * factor;
}

int
main(int argc, char** argv) {
    double d1 = foo(6);
    double d2 = 0.66;
    fprintf(stderr, "%.20E vs %.20E, relative error %.20E\n", d1, d2, (d1 - d2) / d2);
    fprintf(stderr, "%d\n", memcmp(&d1, &d2, 8));
    return 0;
}
```
Running the code gives: 6.60000000000000142109E-01 vs 6.60000000000000031086E-01, relative error 1.68215609791690383419E-16, and memcmp prints 1 (the bit patterns differ).
I did a test: the JSON file size is 22 MB, decoded 100 times. cjson time: 25.55s; lua-resty-json time: 31.52s.
So I still use cjson.
@lgh5549294,
This looks quite awesome as well: https://github.com/xpol/lua-rapidjson
@lgh5549294 Mind sharing your JSON file?
@agentzh My test JSON file and Lua file have been sent to your email!
@lgh5549294, can I have a copy of the JSON file as well? Or can I get a copy from @agentzh? Thanks.
I don't really have an issue to raise; however, I wanted to bring to your attention this benchmark of different JSON parsing libraries for OpenResty:
https://github.com/bungle/lua-resty-libcjson/issues/3
At the moment, lua-resty-json is supposedly the slowest of the pack.