Closed · dcervelli closed this issue 4 years ago
Wow, quite a subtle bug, thanks so much for sharing! I wonder if #89 would help in that case instead of using TextDecoder (which is much slower, last time I checked)?
Nope, `push` + `join` also crashed with OOM on the repro above (with n = 27). ☹️
I did some benchmarking of TextDecoder here: https://jsperf.com/textdecoder-vs-readutf8/7. It seems like `readUtf8` is much faster for small payloads (i.e. no startup cost), which seems pretty important for Mapbox, but becomes radically slower for larger payloads (likely because of the V8 behavior). The crossover point is in the neighborhood of 32 characters on my machine.
I will probably do something like this:

```js
function readUtf8(buf, pos, end) {
    return end - pos < 32 ?
        originalReadUtf8(buf, pos, end) :
        textDecoder.decode(buf.subarray(pos, end));
}
```
@dcervelli thanks a lot for your suggestion! It worked perfectly, although we found the sweet spot to be smaller (~12 characters), which fits with Benedikt's hint here.
In particular, this fixed a huge memory leak in both Chrome & Firefox in GL JS where in certain rare cases (with lots of long strings in tiles), worker memory footprint would grow to gigabytes within seconds, whereas with this patch it's stable and stays below 20MB.
@dcervelli One more note: the reason the inflection point was 32 rather than 12 is that the benchmark used ASCII-only characters; once you mix in other letters (Cyrillic etc.), the performance characteristics are very different. I think we should go with 12 so that performance is stable across all languages, and I'm also fine with accepting some decoding overhead for ASCII strings of length 12-32 if it helps avoid memory leaks like the one in https://github.com/mapbox/mapbox-gl-js/issues/8471. cc @ahk
`readUtf8` is susceptible to this V8 bug and uses too much memory. Here's an illustrative repro you can run in the Chrome console:
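The original repro snippet isn't preserved in this thread (it was apparently parameterized by an `n`, per the "n = 27" remark above). A hypothetical sketch of the kind of code that triggers the behavior, building a 64KB string one character at a time so V8 represents it as a tree of cons strings, might look like:

```js
// Hypothetical repro sketch (the thread's actual snippet is not preserved).
// Repeated string concatenation makes V8 represent the result as a tree of
// "cons string" nodes rather than a flat sequence of characters.
var str = '';
for (var i = 0; i < (1 << 16); i++) {        // 64K one-byte characters
    str += String.fromCharCode(97 + (i % 26));
}
// Store the result without ever reading it, so V8 never flattens it and the
// whole cons-string tree stays pinned in the heap:
var repro = [str];
```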
You can then take a memory profile and find the `repro` variable in the "Containment" view taking up 2MB instead of 64KB. Note the part about flattening: in our case, we weren't accessing the array, so the large memory usage becomes pinned in the heap. If you access the string, it becomes flattened and the memory spike will be transient. (source)
We will work around the issue by using this code in our fork:
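The fork's actual code isn't quoted in this thread; given the TextDecoder discussion above, the workaround presumably looks something like this sketch:

```js
// Hypothetical sketch of a TextDecoder-based readUtf8 (the fork's actual
// code is not quoted in this thread).
var utf8Decoder = new TextDecoder('utf-8');

function readUtf8(buf, pos, end) {
    // decode() returns a flat string, avoiding the cons-string buildup that
    // character-by-character concatenation causes in V8.
    return utf8Decoder.decode(buf.subarray(pos, end));
}
```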
But that won't work with IE, so you may want a polyfill.
Further reading: https://github.com/catapult-project/catapult/issues/2051#issuecomment-188805756