firefox-devtools / profiler

Firefox Profiler — Web app for Firefox performance analysis
https://profiler.firefox.com
Mozilla Public License 2.0

WIP: introduce compressedArrayElement for profile data #5012

Open vvuk opened 1 month ago

vvuk commented 1 month ago

Much of the profile JSON produced by tools such as samply is taken up by arrays in which every element is the same value. samply even has serialization helpers that create arrays of length N filled with just null, 0, etc. in order to match the profile file format. These values take up a lot of space in the final JSON (up to 30% in some profiles), and they are completely unnecessary.

This draft PR shows one way the format could be changed so that an array whose elements are all the same value can be replaced with just that value. In other words:

  lineNumber: [null, null, null, .....]

becomes:

  lineNumber: null

I don't actually like this implementation, but I'm opening this PR to get feedback. The problem is that the profiler front-end code uses the JSON data format directly as its in-memory data format, so every single location that reads from one of these arrays would need to be replaced with a call to compressedArrayElement, which is very error-prone. I've only converted a few of these locations here.
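To make the call-site problem concrete, here is a minimal sketch of what a helper like compressedArrayElement might look like. The signature and the sample table are assumptions for illustration and may not match the code in this PR:

```javascript
// Hypothetical sketch: the helper in the PR may have a different signature.
// A "compressed" column is either a real array or a single value that
// stands in for every element of the column.
function compressedArrayElement(column, index) {
  return Array.isArray(column) ? column[index] : column;
}

// Illustrative table; the values are made up.
const functionTable = {
  length: 3,
  lineNumber: null, // compressed: every entry is null
  columnNumber: [4, 8, 15], // uncompressed
};

// Every read that used to be `functionTable.lineNumber[i]` has to become:
const line = compressedArrayElement(functionTable.lineNumber, 1); // null
const column = compressedArrayElement(functionTable.columnNumber, 1); // 8
```

Any read site that still indexes the column directly would silently get undefined for a compressed column, which is why missing even one of them is risky. This check also assumes that no column legitimately stores arrays as its elements.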

I think a better approach would be to introduce a type that acts like an array but contains a _value that's either an array or a single value. Then, as part of parsing the profile, we would walk the post-JSON.parse data structure and convert these columns, e.g.:

  functionTable.lineNumber = new CompressedArray(functionTable.lineNumber);

CompressedArray would also have a toJSON implementation to serialize the value-or-array.
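A rough sketch of what such a type could look like follows. Only the _value field and toJSON come from the description above; the get() accessor and the second constructor argument are assumptions added here, since a single value carries no length of its own. Supporting real bracket indexing would need something like a Proxy, which this sketch leaves out:

```javascript
// Hypothetical sketch of the proposed type, not an existing profiler class.
class CompressedArray {
  // `value` is either a real array or a single value repeated `length` times.
  // The `length` parameter is an assumption of this sketch.
  constructor(value, length = 0) {
    this._value = value;
    this.length = Array.isArray(value) ? value.length : length;
  }

  // Stand-in for `array[index]`.
  get(index) {
    return Array.isArray(this._value) ? this._value[index] : this._value;
  }

  // JSON.stringify calls toJSON, so the value-or-array is written out as-is.
  toJSON() {
    return this._value;
  }
}

const functionTable = { length: 3, lineNumber: null };
functionTable.lineNumber = new CompressedArray(
  functionTable.lineNumber,
  functionTable.length
);

functionTable.lineNumber.get(2); // null
JSON.stringify(functionTable); // '{"length":3,"lineNumber":null}'
```

Call sites would still have to change from bracket indexing to get() in this form, so it trades the helper call for a method call but keeps the compressed shape in memory.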

mstange commented 1 month ago

If we want to change the in-memory representation, then I would prefer the CompressedArray approach, but it'll be more invasive.

If we mostly care about reducing the size of the string that we pass to JSON.parse, then we could introduce more differences between the "serializable" format and the in-memory format. This distinction already exists, but at the moment it only affects the stringTable / stringArray:

https://github.com/firefox-devtools/profiler/blob/720a43bc96840198ff84eb77c6554f5633378876/src/profile-logic/process-profile.js#L1720-L1764

I think that would be a lot less invasive: we'd convert the "compressed" representation into the uncompressed representation in _unserializeProfile, and none of the rest of the code would need to know about the compressed representation.
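As a sketch of that idea, the expansion step could look roughly like this. The helper name expandCompressedColumns and the way columns are listed are assumptions for illustration; this is not code from _unserializeProfile:

```javascript
// Hypothetical helper, not part of the profiler code base.
// Expands any listed column that was serialized as a single value back into
// a full array, using the table's own `length` field.
function expandCompressedColumns(table, columnNames) {
  const expanded = { ...table };
  for (const name of columnNames) {
    if (!(name in table)) {
      continue; // leave missing columns alone
    }
    const column = table[name];
    if (!Array.isArray(column)) {
      expanded[name] = new Array(table.length).fill(column);
    }
  }
  return expanded;
}

// Serialized ("compressed") form, as it would come out of JSON.parse.
const serialized = { length: 3, lineNumber: null, columnNumber: [4, 8, 15] };

// In-memory form: every listed column is a plain array again.
const inMemory = expandCompressedColumns(serialized, [
  "lineNumber",
  "columnNumber",
]);
// inMemory.lineNumber is [null, null, null]
```

With this approach the saving applies only to the serialized string; the in-memory tables are as large as before, and existing read sites keep indexing plain arrays.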