google / trace-viewer

https://github.com/catapult-project/catapult/tree/master/tracing#readme
487 stars 84 forks source link

256MB file limit #627

Closed dvyukov closed 9 years ago

dvyukov commented 9 years ago

On some html files (result of trace2html) chrome says "Import error: invalid string length". After some painful debugging I've figured out that it's probably a 256MB limit on json file somewhere. When I try to open it in firefox, it just says "TypeError: document.registerElement is not a function trace.html:4258", so I can't test what's the limit in firefox. Does it sound plausible? If so it would be nice if trace2html would fail with a reasonable error message.

dj2 commented 9 years ago

I looked into this a few weeks ago, and apparently didn't open a bug. So, the issue is, there is a max string length in v8. When we build the string that is the JSON data, we exceed that string length and v8 throws an exception.

The solution we were thinking of to handle this is that the trace2html would have to output a zip file with multiple chunks of the JSON. We'd then load each chunk, parse the JSON and then build the events array from that which is then passed to the importers.

@skyostil @natduca

dvyukov commented 9 years ago

The solution we were thinking of to handle this is that the trace2html would have to output a zip file...

That would be great, as I hit the limit on real data.

My json's are huge in part because I attach 2 large strings to every slice, namely start and stop stack trace (where a thread has been resumed and where it is blocked again). There a lots of dups between then stacks.

dj2 commented 9 years ago

If you're looking for stack trace data, I know there has been work to integrate perf into trace-viewer. Would using those stacks work for you? I believe @skyostil could give more info on the current status.

dvyukov commented 9 years ago

I don't fully understand what you propose.

What would work is: (1) an ability to reference a single string from several events, then I don't need to duplicate the same string with a stack trace lots of time or (2) an ability to explicitly associate stacks with slices by some id. that is add a special attribute to slices with stack id, and store stacks themselves somewhere else. Ideally I need 2 stacks per slice -- stack of start and stack of end.

skyostil commented 9 years ago

It sounds like you may be able to leverage some of our perf work. Unfortunately I don't think we documented the exact format anywhere, but the basic idea is that the trace has a dictionary of stack frames. Each frame has a name, an id and a reference to its parent. The slice itself can then just reference the leaf stack frame and you can reconstruct the full stack from there.

Having a separate stack frame at the beginning and end may need some work in the viewer side, but I think it should be doable.

See here for the actual patches: https://code.google.com/p/chromium/issues/detail?id=375754

dvyukov commented 9 years ago

Is it Samples/Stack Traces section of: https://docs.google.com/document/d/1CvAClvFfyA5R-PhYUmn5OOQtYMH4h6I0nSsKchNAySU/edit#heading=h.b4y98p32171 ?

Do you mean that I just add fake profiling samples for beginning and end of each slice? Would it be possible to select slice and the sample separately?

skyostil commented 9 years ago

Yes, that's the thing I was talking about. You could add fake samples, but the UI wouldn't probably be ideal because the samples are on a separate track from the normal trace events because samples are expected to be statistical instead of a definite hierarchy.

Maybe there's a way to construct a more traditional looking slice hierarchy from the samples in tracing. I'm not sure what the best way to visualize that data would be.

natduca commented 9 years ago

This bug is idle. I think changing the file size limit is off the table. Should we retitle this to be "unclear how to solve in trace viewer?"

dj2 commented 9 years ago

I think the title is fine. We will need to solve this at some point as I've run up against this issue. It would be solvable in chrome but outputting a .zip instead of a .json and then break the .json into several smaller files which we then parse ... somehow ... into the trace structures.

Doing it through trace2html is possibly harder, or maybe easier? Instead of writing a single giant JSON string, write several smaller ones that we also assemble through the magic method above?

In either case, I think it's doable, but I also think it's going to be hard.

dvyukov commented 9 years ago

Hi,

Any updates on this?

I've submitted tracing functionality into Go mainline. And I think this will be the most popular issue among Go users.

Can the resulting .html read the trace in streaming mode? I could feed it through http, for example.

dvyukov commented 9 years ago

Do I understand it correctly that trace2html packs a bunch of static files together and also embeds the json into the resulting file? If so it would be super awesome for us (super-super awesome), if we could pack the static part into an html, and then run it as: trace.html?trace=http://localhost:12345 and serve json on the address.

Is it doable? I know nothing about your system, but it looks like it may be relatively easy to do.

I don't know whether it will help with the json size limitation or not. But maybe it will help, as you will not need to have json in a js string at all.

natduca commented 9 years ago

https://github.com/google/trace-viewer/wiki/Embedding may be useful... you definitely can turn trace viewer into a standalone.

As far as fixing the actual limit, I'm a bit unsure the right fixing strategy. I know that if we could agree on a plan, we'd be happy to mentor you in writing a fix for it... On Thu Feb 05 2015 at 4:52:49 AM Dmitry Vyukov notifications@github.com wrote:

Do I understand it correctly that trace2html packs a bunch of static files together and also embeds the json into the resulting file? If so it would be super awesome for us (super-super awesome), if we could pack the static part into an html, and then run it as: trace.html?trace=http://localhost:12345 and serve json on the address.

Is it doable? I know nothing about your system, but it looks like it may be relatively easy to do.

I don't know whether it will help with the json size limitation or not. But maybe it will help, as you will not need to have json in a js string at all.

— Reply to this email directly or view it on GitHub https://github.com/google/trace-viewer/issues/627#issuecomment-73041208.

dvyukov commented 9 years ago

Re Embedding: aha! nice! trying now

natduca commented 9 years ago

Lets do https://github.com/google/trace-viewer/issues/752 to deal with size issues... and maybe we should make a fresh bug about the 256mb limit that summarizes what we know so that if we really do address that limit, we can do so later?

dj2 commented 9 years ago

I think we can just leave this bug open as the 256 meg limit bug. No reason to create another and lose the discussion here.

natduca commented 9 years ago

My comment is that a lot of the conversation is obsolete. The remaining work is just the string parsing. If we had a person driving through looking for bugs, it wouldn't leap out at them what work needs doing. We could I suppose delete the parts of the thread that are obsolete... but thats funky...

dj2 commented 9 years ago

Ah, I assume anyone that picks up the bug will, hopefully, read the comments and when we say the stack part of this is done and we need to figure out how to deal with importing large traces into trace-viewer they''ll see that.

But, if you feel like it, you can open a new bug and link to this one for some context.

catapult-bot commented 9 years ago

Migrated to https://github.com/catapult-project/catapult/issues/627