I think it would be useful to have variants of the utf8 and utf16 functions that output a brks array indexed per code point, instead of per code unit as the current ones do. That is, the same results that come out of the utf32 version, but accepting utf8 and utf16 input. In some situations I want to be able to consume the output 'brks' without having to think about what my source encoding was.
This seems simple to add: put an if around the loop that advances posLast and sets LINEBREAK_INSIDEACHAR, and in the per-code-point mode just increment posLast once per iteration instead.
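To illustrate the output I am after, here is a rough caller-side sketch that gets the same per-code-point array by post-processing the existing per-code-unit result. It is not the library change described above, only a workaround. It assumes that every code point contributes exactly one entry that is not the inside-a-character marker. The function name and SKETCH_INSIDEACHAR are made up for this example, and the marker's value here is a placeholder; real code should use LINEBREAK_INSIDEACHAR from the library header.

```c
#include <stddef.h>

/* Placeholder marker for this sketch only (assumed value).
   Real code should use LINEBREAK_INSIDEACHAR from the library header. */
#define SKETCH_INSIDEACHAR 3

/* Hypothetical helper: compacts a per-code-unit brks array in place into a
   per-code-point array by dropping every inside-a-character entry.
   Returns the number of entries kept, i.e. the number of code points. */
size_t compact_brks_to_code_points(char *brks, size_t len)
{
    size_t out = 0;
    for (size_t i = 0; i < len; ++i) {
        if (brks[i] != SKETCH_INSIDEACHAR) {
            brks[out++] = brks[i];
        }
    }
    return out;
}
```

After the call, brks[0] through brks[n - 1] (n being the return value) can be read with a code-point index, regardless of whether the input was utf8 or utf16. Doing this inside the library would avoid the extra pass, which is why I would prefer built-in variants.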