Closed ChrisLoer closed 6 years ago
Oh also this is meant to be backwards compatible with gl-js in both directions. Old versions that load this should just ignore `processStyledBidirectionalText`. New versions of gl-js that load an old version of the rtl-text plugin will still be able to run BiDi for unstyled text, but won't run it for styled text (see `shaping.js` logic at https://github.com/mapbox/mapbox-gl-js/pull/6994).
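For illustration, the compatibility behavior described above could be sketched as a feature check on the loaded plugin. This is not the actual `shaping.js` logic: `runBidi` is a hypothetical wrapper, the plugin objects are mocks, and the argument order and return shapes are assumptions made only so the sketch runs.

```javascript
// Hypothetical wrapper: choose a BiDi entry point based on what the loaded
// plugin exposes. Argument order and return shapes are assumptions.
function runBidi(plugin, text, lineBreakPoints, styleIndices) {
    const isStyled = Array.isArray(styleIndices);
    if (isStyled) {
        if (typeof plugin.processStyledBidirectionalText === 'function') {
            return plugin.processStyledBidirectionalText(text, styleIndices, lineBreakPoints);
        }
        // Old plugin: no styled entry point, so skip BiDi for styled text and
        // hand the input back in logical order (line breaking is ignored here
        // to keep the sketch short).
        return [[text, styleIndices]];
    }
    // Unstyled text keeps using the entry point that old plugins also provide.
    return plugin.processBidirectionalText(text, lineBreakPoints);
}

// Mock plugins standing in for an old and a new mapbox-gl-rtl-text build.
const oldPlugin = {
    processBidirectionalText: (text) => [`unstyled:${text}`]
};
const newPlugin = {
    processBidirectionalText: (text) => [`unstyled:${text}`],
    processStyledBidirectionalText: (text, styles) => [[`styled:${text}`, styles]]
};

const oldStyled = runBidi(oldPlugin, 'abc', [], [0, 0, 0]);
const newStyled = runBidi(newPlugin, 'abc', [], [0, 0, 0]);
const oldUnstyled = runBidi(oldPlugin, 'abc', []);
console.log(oldStyled, newStyled, oldUnstyled);
```

The other direction needs no code: an old gl-js never looks up the new export, so it simply goes unused.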
It would presumably take less space, though, right? Not sure if the difference would be enough to matter...
Yeah, I'm going to start with the assumption that it doesn't matter. We're not holding onto these arrays, so the cost would just show up... I dunno, maybe with the wasm version of mapbox-gl-rtl-text the overhead of transferring the array would be greater? But it seems reasonable to ignore unless it actually shows up as significant in profiling.
Huh, ICU's download site appears to be down, which breaks the build. I'll try again later.
- `processStyledBidirectionalText` that takes an array of style indices in parallel to the input text and returns results annotated with the correct (reordered) indices
- `processStyledBidirectionalText` that splits contiguous input style sections into discontiguous output sections

Note the test strings in `arabic.test.js` are mind-bending and may break layout in your text viewer (that includes GitHub's preview: those arrays that look broken aren't).
The previous approach (which we still use for unstyled text) was:

- `setLine` with chosen line break points to prepare individual lines
- `writeReordered` to correctly transform/reorder all logical runs within the line into visual runs.

The new approach for styled lines is:

- `setLine` with chosen line break points to prepare individual lines
- `getVisualRun` to iterate over directional runs within a line in visual order. For each run:
  - `writeReverse` to reverse the section and apply transformations. This may change the length of the segment, but since the whole segment shares the same style, we can just apply that style to every character in whatever the reversed output is. Note that `writeReverse` is responsible for keeping combining characters together in the right (aka still "logical") order. If you changed style mid-combining character, this algorithm could move the combining character to the wrong place, and you would deserve whatever happened to you.

One open question in my mind is whether the style annotation interface makes sense (an array of annotations maintained in parallel with a flat string). Alternatively, we could represent the input in terms of sections (i.e. instead of representing a section "foo" with style 0 as `["foo", [0, 0, 0]]`, we could represent it as `["foo", 0]`). Internally, it was easier for me to work with a data structure that made it super-easy to go from "source string logical index" -> "style", but it'd be easy enough to do the conversion. The thing is, I'm not sure an annotated-per-section interface is actually easier to work with, because the caller then has to keep track of iterating over sections too, and it's confusing that a contiguous input section can actually be split up into multiple discontiguous output sections...

/cc @anandthakker
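
To make the two representations and the "discontiguous output" point concrete, here is a toy sketch. It does not call ICU. All function names are hypothetical, the visual runs are supplied by hand in place of `getVisualRun`, and a plain character reversal stands in for `writeReverse` (so no mirroring and no combining-character handling). Uppercase letters stand in for RTL characters.

```javascript
// Hypothetical helper: expand [text, style] sections into a flat string plus
// one style index per character (the parallel-array form).
function expandSections(sections) {
    let text = '';
    const styles = [];
    for (const [sectionText, style] of sections) {
        text += sectionText;
        for (let i = 0; i < sectionText.length; i++) styles.push(style);
    }
    return [text, styles];
}

// Hypothetical helper: collapse per-character styles back into sections.
function collapseToSections(text, styles) {
    const sections = [];
    for (let i = 0; i < text.length; i++) {
        const last = sections[sections.length - 1];
        if (last && last[1] === styles[i]) last[0] += text[i];
        else sections.push([text[i], styles[i]]);
    }
    return sections;
}

// Toy stand-in for the styled approach: walk runs in visual order, split each
// run into same-style segments, and reverse the segments of RTL runs.
function reorderStyledLine(text, styles, visualRuns) {
    let outText = '';
    const outStyles = [];
    for (const run of visualRuns) {
        const end = run.start + run.length;
        const segments = [];
        let segStart = run.start;
        for (let i = run.start + 1; i <= end; i++) {
            if (i === end || styles[i] !== styles[segStart]) {
                segments.push({text: text.slice(segStart, i), style: styles[segStart]});
                segStart = i;
            }
        }
        // In an RTL run the last logical segment comes first visually.
        if (run.rtl) segments.reverse();
        for (const seg of segments) {
            // Simplification: plain reversal instead of writeReverse.
            const visual = run.rtl ? Array.from(seg.text).reverse().join('') : seg.text;
            outText += visual;
            // The whole segment shares one style, so apply it to every output character.
            for (let i = 0; i < visual.length; i++) outStyles.push(seg.style);
        }
    }
    return [outText, outStyles];
}

// Style 1 covers the contiguous logical span "EF g", which crosses a run boundary.
const [text, styles] = expandSections([['ab CD', 0], ['EF g', 1], ['h', 0]]);
const visualRuns = [
    {start: 0, length: 3, rtl: false},
    {start: 3, length: 4, rtl: true},
    {start: 7, length: 3, rtl: false}
];
const [visualText, visualStyles] = reorderStyledLine(text, styles, visualRuns);
const visualSections = collapseToSections(visualText, visualStyles);
console.log(JSON.stringify(visualSections));
// → [["ab ",0],["FE",1],["DC",0],[" g",1],["h",0]]
```

The single input section with style 1 comes out as two separate sections ("FE" and " g"), which is the case a per-section interface would have to expose to callers.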