Closed by @paulirish 7 years ago
Working on this pretty soon 👍
Related work: https://codereview.chromium.org/2755943002
And here's the DevTools BalancedJSONTokenizer and _writeBalancedJSON I mentioned.
Are there any more tips you've got that might help me in a trivial way? Also, thank you so much so far!
our existing JSON.parse() for the trace is in driver.js; that's the one we need to replace.
presumably we'd parse each chunk as it comes in from the IO.read stream.
i think it's best for us to start with an npm module for the incremental JSON parsing. we don't have a real node stream, so we'll need a library that's okay with that. also we want it to have minimal runtime dependencies.
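To make the chunked approach concrete, here's a minimal sketch of draining a protocol IO stream chunk-by-chunk and handing each chunk to an incremental parser, instead of concatenating one giant string and calling JSON.parse once. `readChunk` and `parser` are hypothetical stand-ins, not the actual driver.js API; `readChunk` just mirrors the shape of the protocol's IO.read response.

```javascript
// Sketch: drain an IO stream in chunks and feed an incremental parser.
// `readChunk` and `parser` are hypothetical placeholders for illustration.
async function drainStream(readChunk, parser) {
  let done = false;
  while (!done) {
    // readChunk resolves to {data, eof}, mirroring the protocol IO.read shape.
    const {data, eof} = await readChunk();
    // The parser keeps only the unconsumed remainder, not the whole trace.
    parser.write(data);
    done = eof;
  }
  return parser.result();
}
```

The point is that memory usage is bounded by the parser's internal remainder, not by the full serialized trace.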
Gotcha!
I spent some time today looking at the available options (I was working on a separate problem, but it intersected with streaming JSON parsing, so it was worth a look :)
Right now I think the BalancedJSONTokenizer and _writeBalancedJSON mentioned in https://github.com/GoogleChrome/lighthouse/issues/1798#issuecomment-301934352 are the way forward. The options out there are kind of overkill, and they don't seem particularly effective at the relatively simple case that we do have here (we don't need to parse arbitrary JSON, just the timeline/trace format).
BalancedJSONTokenizer is nice in particular because a) it discards the chunk string after it's been parsed and b) it just uses JSON.parse on that chunk, not requiring custom parsing character-by-character.
It is optimized for a few particular formats and could struggle with others, but one of those particular formats is exactly what we want to parse :)
This was all fairly cursory, however, so I'd be interested in hearing about more options if others have looked into this.
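To make the tokenizer's trick concrete, here's a simplified from-scratch illustration of the idea, not the actual DevTools class: track brace depth and string state character-by-character, and whenever a top-level object balances out, hand that slice to plain JSON.parse and discard the consumed prefix. Error handling and the outer `{"traceEvents":[...]}` wrapper are omitted for brevity.

```javascript
// Simplified sketch of the BalancedJSONTokenizer idea (NOT the real DevTools
// implementation): feed string chunks in, get one callback per balanced {...}
// object, each parsed with plain JSON.parse.
class MiniBalancedTokenizer {
  constructor(callback) {
    this.callback = callback;
    this.buffer = '';     // unconsumed tail of the stream
    this.depth = 0;       // current brace nesting depth
    this.inString = false; // are we inside a JSON string literal?
    this.start = -1;      // index of the current top-level object's '{'
  }
  write(chunk) {
    this.buffer += chunk;
    // Resume scanning where the previous write() left off.
    let i = this.buffer.length - chunk.length;
    for (; i < this.buffer.length; i++) {
      const c = this.buffer[i];
      if (this.inString) {
        if (c === '\\') i++;                 // skip the escaped character
        else if (c === '"') this.inString = false;
      } else if (c === '"') {
        this.inString = true;
      } else if (c === '{') {
        if (this.depth === 0) this.start = i;
        this.depth++;
      } else if (c === '}') {
        this.depth--;
        if (this.depth === 0) {
          // Balanced object: parse just this slice with JSON.parse...
          this.callback(JSON.parse(this.buffer.slice(this.start, i + 1)));
          // ...then discard the consumed prefix so memory stays bounded.
          this.buffer = this.buffer.slice(i + 1);
          i = -1;
        }
      }
    }
  }
}
```

Feeding it `'{"pid":1}{"pid":2}'` across two arbitrary chunk boundaries yields two callback invocations, which is exactly the property we want for trace events.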
I've been trying to reproduce the bug using Lighthouse, the Lighthouse CLI, and a clone of Lighthouse against my clone of Chrome, but I don't get this error. Do you mind walking me through how to reproduce it? Assume I've got everything set up. A screenshot would be great. @brendankenny do you mind me working on this? Good First Contribution :)
@ev1stensberg unfortunately this is one I've only seen once or twice myself, and haven't been able to reliably reproduce, so it's more something we know is happening for some people and (think we) know why it's happening, so hopefully we can just fix it and watch the number of error reports drop :)
One option for testing/development is creating an artificial test case: taking one of the test traces (e.g. this one) and increasing the number of traceEvents in it until it breaks JSON.parse. The result can be used for developing the incremental parser, which we can verify parses JSON and works for very large JSON, and then cross our fingers that it starts fixing #1685 and #1733.
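One way to build that artificial test case is a small script that duplicates an existing trace's events until the trace reaches a target size. This is a sketch for illustration; `inflateTrace` and the size target are made-up names, and the timestamp shift is just to keep the cloned events distinguishable:

```javascript
// Sketch: inflate a small test trace by duplicating its traceEvents until it
// reaches a target event count (pick one large enough to stress JSON.parse).
// `inflateTrace` is a hypothetical helper, not part of Lighthouse.
function inflateTrace(trace, targetEventCount) {
  const events = trace.traceEvents;
  if (!events || events.length === 0) return {traceEvents: []};
  const inflated = [];
  while (inflated.length < targetEventCount) {
    for (const e of events) {
      if (inflated.length >= targetEventCount) break;
      // Clone with a shifted timestamp so duplicated events stay distinct.
      inflated.push(Object.assign({}, e, {ts: (e.ts || 0) + inflated.length}));
    }
  }
  return {traceEvents: inflated};
}
```

Serializing the result with JSON.stringify and writing it to disk gives a deterministic oversized fixture, rather than waiting for a real-world trace to trigger the failure.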
> do you mind me working on this? Good First Contribution :)
It's great to work on this! I was just looking at a related problem today (OOM errors when saving traces to disk) and want to make sure I don't step on your toes. Can you share what you're thinking?
I've really just hovered around the source code, but using the suggestions from you & @paulirish is enough to convince me. Also, this is internal, so we are avoiding extra dependencies, which is good. Stepping on my toes is alright.
> It's great to work on this! I was just looking at a related problem today (OOM errors when saving traces to disk) and want to make sure I don't step on your toes. Can you share what you're thinking?
Do you have a crbug on this that I could have a look at? Also, my guess is that our work isn't colliding extensively, as this isn't a very novel or code-heavy exercise?
@brendankenny seems people have reported this has been resolved in v2. I want to work on this still though, but we should try to get a test case up. Do you have time to exchange a few emails to get me started? I'm eager to help out, and also this interacts with a project that is starting in June (hopefully)
heya @ev1stensberg.
brendan's made some more headway here so i think it makes sense for him to finish off the streaming json parser/writer. shouldn't take him too long.
i'm looking at our issues and #2320 and #2291 could work for your next issue. wdyt?
Sounds good. Will start on this tomorrow. Is there anyone I can get some help from regarding testing the source code against Canary/master? Tried with no luck; it would be good to get some feedback on my walkthrough to see if there's anything I'm doing wrong :)
Details are here: https://github.com/GoogleChrome/lighthouse/issues/1685#issuecomment-
We can use the existing incremental JSON parser that's in devtools (see WebInspector.TimelineLoader and WebInspector.TextUtils.BalancedJSONTokenizer.) Or alternatively we pull in a new npm module for that.
This would fix #1685 and #1733