Substrings: slices vs. copies

The stringview_wtf16.slice operation creates a string S2 that's a substring of another string S1. V8 originally implemented this by reusing its existing machinery for creating string slices, i.e. S2 would only store a pointer to S1, a start offset, and a length, making the slice operation itself very cheap because it doesn't need to copy any characters, and making S2 very memory-efficient (especially when it's long, so that many characters are shared with S1).

Like almost any engineering technique, this approach is a tradeoff with pros and cons. In this case, its drawback is that a relatively short string (S2) can be keeping an otherwise-unreachable other string (S1) alive, which might be much bigger. We have just encountered a real-world use case where this happens a lot and has detrimental effects on memory consumption: user-space JSON parsing. Any string values in the decoded data keep the entire JSON source string in memory when they are represented as string slices. (Note that the specifics of the JSON format are irrelevant, the same would happen for any other string-based serialization format where string values occur as substrings of the serialized data, JSON just happens to be a very common case of this.) For comparison, JavaScript doesn't run into this problem, because the implementation of JSON.parse never creates sliced strings, because it can make a very reasonable assumption that the JSON source string is meant to be thrown away when the parsing operation is complete.

As a short-term mitigation, we have just changed V8's implementation of stringview_wtf16.slice to make full copies when the substring is at most half as long as the original string. A possible longer-term engine-side mitigation might be to build a tracking mechanism across the entire managed heap that figures out which long strings are being used by which slices, and can then decide whether converting these slices to copies would be overall beneficial or not. That's fairly heavyweight machinery for solving a rather limited problem though, so while I'd be excited to see this prototyped, I'm not convinced that it would actually be a reasonable choice for engines to spend the engineering hours to build such machinery and, more importantly, the runtime CPU cycles to operate it.

As a proper solution, it would seem reasonable to me to add a way to let Wasm modules explicitly control the desired behavior:

One option for this would be to have two variants of the slice operation: slice_allow_shallow and slice_force_copy (illustrative names, to be bikeshedded).
Another option would be to have an explicit "make this a simple/flat/direct string" operation. That could be more generic/flexible (e.g. it could also be used to flatten ropes resulting from concatenation), but also seems harder to spec/name, because it would only affect implementation internals, with no change to observable values. It might also be somewhat harder to implement, because for maximum efficiency engines would have to fuse it with the preceding operation (to avoid allocating a short-lived intermediate string). For optimizing compilers that's not a big deal, but not every engine wants to implement an optimizing compiler, and not every function always runs in optimized mode.
There may well be additional options to accomplish the same goal.

I don't feel strongly about how this will be solved, I'm just saying that I think it would make sense to have some improvement over the status quo.

Without new instructions, copies of string slices can be forced by converting the string to an array and then back to a string (e.g. using string.encode_wtf16_array + string.new_wtf16_array, or their wtf8 equivalents); engines could conceivably learn to recognize this pattern and shortcut it to avoid the double copy (but not shortcut it too much: if they avoided both copies, that would defeat the point). IMHO this would not be a pretty solution though; it feels similar to dubious JavaScript performance tricks like "temporarily install this object as the prototype of some other dummy object just to make the JS engine to switch its internal representation into a certain mode".

(Same discussion for imported strings: https://github.com/WebAssembly/js-string-builtins/issues/1)

WebAssembly / stringref

Substrings: slices vs. copies #63