ibmruntimes / yieldable-json

Asynchronous JSON parser and stringify APIs that make use of generator patterns

parseAsync Sometimes blocks for a *long* time #36

Closed · teetering closed this 3 days ago

teetering commented 1 year ago

I've been testing the library recently, and while it sometimes works well, the call to parseAsync sometimes blocks for a long, long time (e.g. 1 minute). I'm not talking about the time to parse the JSON string, but rather the blocking time before parseAsync returns (before the parse even takes place).

I'm still trying to figure out what's triggering it. Perhaps it is the same issue as the one mentioned here:

https://github.com/ibmruntimes/yieldable-json/issues/18

Regardless, this obviously defeats the whole point of the library. I think it's important to give a warning up front at the top of the README that this library shouldn't be used in any production systems. This will surely bite a lot of people.

gireeshpunathil commented 1 year ago

do you have a program that recreates the issue?

> I think it's important to give a warning up front at the top of the README that this library shouldn't be used in any production systems.

The whole point of the library is to be used in production. Instead of documenting defects, I would rather fix them. If you have a stable reproduction, please pass it on; that way you will also be part of improving this module!

teetering commented 1 year ago

First of all, I want to thank you for your work on this library; incredible effort! I know that it is not an easy endeavor, and I personally believe that it is a critical piece that is conspicuously missing from Node's native functionality.

That said, I do think it's important to state explicitly up front that it is not production ready (until it is). In my example (from the real world), yieldable-json is causing the event loop to block for close to 1 minute, which would obviously be fatal to any production system.

parse-test.tar.gz

Attached is a simple repro. The program starts with `setInterval(() => { console.log('*heartbeat*') }, 1000);`. Here is the output when I run it on my computer:

2ms ... Begin
5ms ... About to stringify
16ms ... Stringified
0ms ... About to call yj.parseAsync
16683ms ... Called yj.parseAsync
*heartbeat*
*heartbeat*
22245ms ... Finished parsing err:false data:true
*heartbeat*
*heartbeat*
*heartbeat*
...

It demonstrates two big problems (which I believe are possibly unrelated):

1. The call to yj.parseAsync itself blocks for roughly 17 seconds before it even returns.
2. Even after it returns, the parse keeps the event loop mostly blocked: the 1-second heartbeat fires only twice over the remaining ~5.5 seconds.
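
For anyone who doesn't want to open the tarball, a simplified sketch of a program along the same lines (not the exact attached code; the object shape, sizes, and log helper here are placeholders):

```js
// Simplified repro sketch (placeholders, not the attached parse-test code):
// one large object containing a very long string, parsed with yieldable-json
// while a 1-second heartbeat shows whether the event loop stays responsive.
const yj = require('yieldable-json');

setInterval(() => { console.log('*heartbeat*'); }, 1000);

const t0 = Date.now();
const log = (msg) => console.log(`${Date.now() - t0}ms ... ${msg}`);

const big = {
  longText: 'x'.repeat(800000), // one ~800k-character string
  items: Array.from({ length: 1000 }, (_, i) => ({ id: i, value: `item-${i}` })),
};

log('About to stringify');
const text = JSON.stringify(big);
log('Stringified');

log('About to call yj.parseAsync');
yj.parseAsync(text, (err, data) => {
  log(`Finished parsing err:${!!err} data:${!!data}`);
});
log('Called yj.parseAsync'); // on the affected machines this line only prints many seconds later
```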

gireeshpunathil commented 1 year ago

@teetering - thanks, I am able to reproduce the issue you reported. Stay tuned, I will see what is going on.

JohnBatts418 commented 1 year ago

Hi, just came across this and hit a similar problem. Is there any update on this?

gireeshpunathil commented 1 year ago

I haven't completed my analysis yet; expect some response in a week's time.

gireeshpunathil commented 1 year ago

An interim update:

The JSON object in the given test case contains, among other things, a string that is more than 800,000 characters long. This one string accounts for 99% of the time, spent in the function normalizeUnicodedString:

https://github.com/ibmruntimes/yieldable-json/blob/5809572f1957a76e6c6137ca2b9fd5640b96eb0d/yieldable-parser.js#L67

When I replaced that string with a small token, the test completed in less than a second.

The purpose of this function is to walk an incoming string and make sure any escape characters are properly handled. It looks like this function can be optimised.
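
To illustrate the kind of pattern that can cause this (this is NOT the actual normalizeUnicodedString code, just a sketch of the shape of the problem):

```js
// Illustration only - not the actual yieldable-json code.
// If every step searches or re-slices the whole remaining string, each step costs
// O(remaining length), so one ~800k-character value turns into a near-quadratic
// pass that runs inside a single synchronous call and never yields to the event loop.
function collectEscapePositionsNaive(str) {
  const positions = [];
  let rest = str;
  let offset = 0;
  while (rest.length > 0) {
    const i = rest.indexOf('\\');  // scans the remainder from the start each time
    if (i === -1) break;
    positions.push(offset + i);
    rest = rest.slice(i + 1);      // copies the remainder: another O(remaining length)
    offset += i + 1;
  }
  return positions;
}
```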

I will study the function to see what can be done better. Meanwhile, feel free to propose PRs if you have ideas!

gireeshpunathil commented 1 year ago

@JohnBatts418 - can you share your test case too? I want to examine it and confirm that your issue is the same as this one.

latish commented 11 months ago

Running into the same issue of very slow performance when JSON object has a really long string in it.

gireeshpunathil commented 11 months ago

Thanks for the update. I'm running short on bandwidth right now; PRs welcome!

lukas8219 commented 3 months ago

@teetering On which machine are you running this benchmark? I am willing to spend some time on this issue in the next month, but on my machine it takes 2 seconds. I want to simulate a CPU-constrained environment.

The event-loop blocking I am able to repro, though.

teetering commented 3 months ago

> @teetering On which machine are you running this benchmark? I am willing to spend some time on this issue in the next month, but on my machine it takes 2 seconds. I want to simulate a CPU-constrained environment.
>
> The event-loop blocking I am able to repro, though.

Thank you for looking! After your message, I tried two different hardware architectures and got vastly different performance results, which I guess explains why you don't see the same thing I do.

My very bad results above are on Ubuntu 22.04.4 LTS (Linux 6.5.0-44-generic x86-64) running on an Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz (which is typically quite decent).

On the other hand, on a Mac M1, I'm getting much better performance (though, as you say, still with problems blocking the loop):

2ms ... Begin
5ms ... About to stringify
7ms ... Stringified
0ms ... About to call yj.parseAsync
2208ms ... Called yj.parseAsync
*heartbeat*
*heartbeat*
3103ms ... Finished parsing err:false data:true
*heartbeat*
*heartbeat*
lukas8219 commented 3 months ago

This is very interesting.

So per my investigation, reinforcing what @gireeshpunathil said, the problem comes from the normalization. Mostly, it seems to be a hot loop calling indexOf on huge strings. I am facing a couple of consistency issues, as the current unit tests don't catch all the cases.

But thanks for letting me know about the architecture specifics.

My machine is also a MacBook (M3 Max Pro), and I see exactly the 2-3 second delay you just posted, @teetering.

I'll first try to fix the event loop being blocked, and then I'll look into AMD-specific issues (which seems really tough).

Again, just reinforcing that this is an open-source contribution, so I am not promising anything 🙏🏻 - rather, I'm trying to help, as this is also similar to our use cases.

lukas8219 commented 2 months ago

Update:

Got the fix working, by naively splitting the indexOf into a by-chunk, generator-style scan. It does decrease performance, but it manages to unblock the event loop.
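
The rough idea is something like this (a simplified sketch, not the actual PR code; the chunk size and helper names are made up):

```js
// Sketch of a by-chunk, generator-style scan (names and chunk size are made up).
// Each step searches only a bounded slice, so the caller can hand control back
// to the event loop between steps instead of doing one huge synchronous scan.
const CHUNK = 65536;

function* indexOfByChunk(str, needle, from = 0) {
  for (let start = from; start < str.length; start += CHUNK) {
    // Overlap by needle.length - 1 so matches spanning a chunk boundary are found.
    const slice = str.slice(start, start + CHUNK + needle.length - 1);
    const idx = slice.indexOf(needle);
    if (idx !== -1) return start + idx; // found: return the absolute position
    yield;                              // not found yet: let the caller defer
  }
  return -1;
}

// Example driver: step the generator, yielding to the event loop between chunks.
function findAsync(str, needle, cb) {
  const it = indexOfByChunk(str, needle);
  (function step() {
    const { value, done } = it.next();
    if (done) return cb(value);
    setImmediate(step);
  })();
}
```

Each setImmediate hop gives timers and I/O a chance to run (which is what restores the heartbeat), at the cost of some extra overhead per chunk.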

Next steps are mainly checking the current changes on AMD/Intel CPUs and ensuring the performance bottlenecks are the same.

For reference:

The shared parse-test folder executes in 4.8s total on my local machine. With the event-loop fix, it takes 5.5-6s. I am assuming it could be much faster if the intensity parameter were bumped, but again that would hit the event loop.
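
(For reference, intensity is the knob yieldable-json exposes for how much work it does between yields; assuming I remember the documented parseAsync(data[, reviver][, intensity], callback) signature correctly, bumping it looks roughly like this - 16 is an arbitrary value:)

```js
// Rough sketch, assuming the documented parseAsync(data[, reviver][, intensity], callback) signature.
// A higher intensity means more work per slice before yielding: faster overall,
// but longer stretches where the event loop is blocked.
yj.parseAsync(text, (key, value) => value, 16, (err, data) => {
  console.log('parsed:', !err && !!data);
});
```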

gireeshpunathil commented 2 months ago

@lukas8219 - thanks for pursuing this and coming up with a solution. I will continue watching the PR, let me know when it is in a reviewable state.

lukas8219 commented 2 months ago

@teetering Can you share which Node version you were running when you did both benchmarks, Intel vs Apple/ARM? I've only run the benchmark in virtualized environments (i.e. limactl on macOS), and I am not seeing the issue using Node v20.16.0 with Ubuntu on AMD64.

Using these images:

- location: "https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-amd64.img"
  arch: "x86_64"
- location: "https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-arm64.img"
  arch: "aarch64"

I'll have to spin up a VM to properly benchmark this, but I am afraid this might be much more related to Node internal APIs.

Performance was not great, but still around 7-8s, both using master and my code.

teetering commented 2 months ago

@lukas8219 I am testing on v20.16.0 ...

lukas8219 commented 1 month ago

Got reproducible results on a non-VM machine using GCP. My code fixed the problem but cost ~2 seconds more. Let's wait for @gireeshpunathil's thoughts on the trade-offs, so I can continue cleaning up the code and finishing up the PR.

gireeshpunathil commented 1 month ago

@lukas8219 - please continue the cleanup work and undraft the PR when you feel it is ready for review. Given that it has fixed the original problem, I would deem the trade-off acceptable; we could find ways to eliminate that too in future PRs!

lukas8219 commented 1 month ago

@gireeshpunathil The PR is ready for review!