Open vvuk opened 1 month ago
If we want to change the in-memory representation, then I would prefer the CompressedArray approach, but it'll be more invasive.

If we mostly care about reducing the size of the string that we pass to `JSON.parse`, then we could introduce more differences between the "serializable" format and the in-memory format. This distinction already exists, but at the moment it only affects the `stringTable` / `stringArray`.

I think that would be a lot less invasive: we'd convert the "compressed" representation into the uncompressed representation in `_unserializeProfile`, and none of the rest of the code would need to know about the compressed representation.
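As a rough sketch of that less-invasive path (the helper and field names here are hypothetical, not actual profiler code):

```javascript
// Hypothetical sketch: expand a "compressed" serialized field back into a
// plain array during unserialization, so the rest of the code only ever
// sees ordinary arrays. A field counts as "compressed" when it holds a
// single scalar instead of an array.
function expandCompressedField(value, length) {
  // Already a real array: nothing to do.
  if (Array.isArray(value)) {
    return value;
  }
  // A single value standing in for `length` identical entries.
  return new Array(length).fill(value);
}

// Illustrative use in _unserializeProfile-style code (field names are
// made up for this example):
function expandSamplesTable(samples) {
  return {
    ...samples,
    weight: expandCompressedField(samples.weight, samples.length),
  };
}
```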
Much of the profile JSON coming from something like samply is taken up by arrays containing an identical value. samply even has serialization helpers to create arrays of length N containing just null, 0, etc., in order to match the profile file format. These values take up a lot of space in the final JSON (up to 30% in some profiles), and they are completely unnecessary.

This draft PR shows one approach to changing the format so that an array whose elements are all the same value can be replaced with just the value itself. In other words, an array such as `[1, 1, 1, …, 1]` becomes just `1`.
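For illustration (the field names here are made up; real profile tables have many more columns):

```javascript
// A samples table in the current format: every column is a full array,
// even when every element is identical.
const uncompressed = {
  samples: { length: 4, weight: [1, 1, 1, 1], time: [0, 5, 10, 15] },
};

// The same table in the proposed compressed format: `weight` collapses
// to its single repeated value, while `time` varies per sample and so
// stays a full array.
const compressed = {
  samples: { length: 4, weight: 1, time: [0, 5, 10, 15] },
};

// The serialized string shrinks accordingly.
console.log(JSON.stringify(uncompressed.samples).length >
            JSON.stringify(compressed.samples).length); // true
```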
I don't actually like this implementation, but I'm opening this PR to get feedback. The problem is that the profiler front-end code uses the JSON data format as the in-memory data format directly, so every single location that reads from one of these arrays would need to be replaced with a call to `compressedArrayElement`, which is very error-prone. I've only captured a few of these locations here.

I think a better approach would be to introduce a type that acts like an array but contains a `_value` that's either an array or a single value, and then, as part of parsing the profile, run through the post-`JSON.parse` data structure and convert these. `CompressedArray` would also have a `toJSON` implementation to serialize the value-or-array.
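A minimal sketch of that `CompressedArray` idea (the method names and conversion helper below are assumptions for illustration, not the actual proposal's API):

```javascript
// Hypothetical sketch: a thin wrapper whose `_value` is either a plain
// array or a single value repeated `length` times.
class CompressedArray {
  constructor(value, length) {
    this._value = value;
    this.length = Array.isArray(value) ? value.length : length;
  }

  // Element access: index into the array, or return the repeated value.
  get(i) {
    return Array.isArray(this._value) ? this._value[i] : this._value;
  }

  // JSON.stringify calls toJSON automatically, so serialization emits
  // the compact value-or-array form without any extra machinery.
  toJSON() {
    return this._value;
  }
}

// The post-JSON.parse conversion pass described above would wrap each
// compressible column in one of these.
function toCompressedArray(value, length) {
  return new CompressedArray(value, length);
}
```

One nice property of this approach is that reads go through a single `get` accessor instead of bare indexing, so the compressed case is handled in one place rather than at every call site.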