Help wanted: Bi-directional text and Arabic shaping support

nathansobo commented 6 years ago

Our text rendering strategy cuts corners to achieve maximal performance in a web-based environment. We render glyphs to a texture atlas and then composite them on the GPU with a texture-mapped quad representing each character on screen. We currently composite characters with a naive strategy, just laying them out one after another from left to right.
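As a rough illustration of the naive compositing strategy described above, here is a minimal sketch. It assumes a fixed advance width per character (a monospace font); the names `GlyphQuad` and `layout_naive` are hypothetical and not taken from the Xray codebase.

```rust
/// One texture-mapped quad per character (hypothetical type, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub struct GlyphQuad {
    pub ch: char,
    pub x: f32,
    pub y: f32,
}

/// Places characters one after another from left to right.
/// Assumption: every character advances by the same fixed width.
pub fn layout_naive(line: &str, advance: f32, baseline_y: f32) -> Vec<GlyphQuad> {
    line.chars()
        .enumerate()
        .map(|(i, ch)| GlyphQuad {
            ch,
            x: i as f32 * advance,
            y: baseline_y,
        })
        .collect()
}
```

Because the position of a character depends only on its index, this layout cannot reorder right-to-left runs or substitute contextual glyphs, which is the limitation this issue is about.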

Bi-directional text

Xray is a code editor, and most programming languages assume a left-to-right layout. We're not interested in supporting authoring an entire document with right-to-left lines. To support programmers working in right-to-left locales, however, we do want the ability to embed runs of right-to-left text within left-to-right text, such as within strings and comments. This is called bi-directional text.

Interestingly, text in right-to-left languages is still stored from left-to-right in strings. There's an official algorithm defined by the Unicode standards body for transforming fragments of a string from this left-to-right "logical order" to the order the string should be displayed on screen. Anyone that wants to take on this issue will probably need to wrap their heads around the basics of how it works.
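To make the idea of logical versus display order concrete, here is a deliberately simplified sketch. It is not the Unicode Bidirectional Algorithm: it assumes a left-to-right base direction, treats everything in the Hebrew and Arabic blocks as strongly right-to-left, and treats every other character (including spaces between right-to-left words, which the real algorithm resolves by context) as left-to-right. The function names are hypothetical.

```rust
/// Simplifying assumption: anything in the Hebrew or Arabic blocks is
/// strongly right-to-left. The real algorithm uses per-character bidi classes.
fn is_rtl(ch: char) -> bool {
    matches!(ch, '\u{0590}'..='\u{05FF}' | '\u{0600}'..='\u{06FF}')
}

/// Returns, for each visual position, the index of the logical character shown there.
/// Each maximal run of right-to-left characters is reversed; everything else stays put.
pub fn visual_order(line: &str) -> Vec<usize> {
    let chars: Vec<char> = line.chars().collect();
    let mut order = Vec::with_capacity(chars.len());
    let mut i = 0;
    while i < chars.len() {
        if is_rtl(chars[i]) {
            let start = i;
            while i < chars.len() && is_rtl(chars[i]) {
                i += 1;
            }
            // Reverse the run so its first logical character ends up rightmost.
            order.extend((start..i).rev());
        } else {
            order.push(i);
            i += 1;
        }
    }
    order
}

/// Applies `visual_order` to produce the string in display order.
pub fn to_visual(line: &str) -> String {
    let chars: Vec<char> = line.chars().collect();
    visual_order(line).into_iter().map(|i| chars[i]).collect()
}
```

For the logical string `a`, ALEF, BET, `b`, this yields the visual order `a`, BET, ALEF, `b`. A real implementation also has to handle embedding levels, numbers, neutrals, and mirrored punctuation.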

Arabic shaping

When we tested using HarfBuzz to perform a general text-shaping pass on every line, we ran into a couple problems:

HarfBuzz is written in C++, and it's not currently possible to compile a single WebAssembly module that contains both Rust and C/C++. At least we couldn't figure it out. Let us know if we're wrong.
When we compiled HarfBuzz to its own module and tested running strings through it, it was taking between 4ms and 20ms to process 50 lines of 100 characters each, depending on the font.

Even if we solve these issues, there's still the issue of rendering glyphs. We currently render glyphs via an HTML5 canvas. In Electron, this simplifies a lot of cross-platform headaches related to loading fonts, dealing with fallbacks, etc. For the WebAssembly case, this is our only option unless we want to render everything via FreeType, which seems extremely complex.

However, canvas isn't actually capable of rendering glyphs. It can only render strings. So even if HarfBuzz or a platform-specific text shaping library returns a list of glyphs to render, we can't actually render an arbitrary glyph with canvas. Some glyphs in some languages are only accessible via context-specific combinations of letters.

Long story short, skipping generalized text shaping saves us a lot of complexity and performance overhead in the common case of editing code. What does it lose us?

Support for rendering Arabic scripts
Support for rendering Indic scripts

Both of these script families rely on contextual alternates. For example, Arabic characters can correspond to one of four different glyphs depending on their position within a word. Indic scripts combine syllables into individual glyphs in ways I don't really understand.

Indic script support is out of scope for this issue, but it turns out we can support Arabic without doing full-blown text shaping and font rasterization. Unicode happens to define "presentational" code points, which each represent a single contextual variant of a character. This great article from the Mapbox GL JS team describes how a normal string containing Arabic can be transformed to use these presentational forms, which we can then rasterize with our naive character strategy without issues.
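The substitution described above can be sketched roughly as follows. This is an illustration under strong assumptions, not a usable shaper: the table covers only three letters (ALEF, BEH, LAM), the names `Forms` and `shape` are hypothetical, combining marks are not treated as transparent, and the mandatory lam-alef ligature is not handled. The input is expected in logical order.

```rust
/// Presentation forms for one base letter (hypothetical table layout).
/// `initial` and `medial` are `None` for letters that only join to the previous letter.
struct Forms {
    base: char,
    isolated: char,
    final_form: char,
    initial: Option<char>,
    medial: Option<char>,
}

// Deliberately tiny table covering three letters only.
const FORMS: &[Forms] = &[
    Forms { base: '\u{0627}', isolated: '\u{FE8D}', final_form: '\u{FE8E}', initial: None, medial: None }, // ALEF
    Forms { base: '\u{0628}', isolated: '\u{FE8F}', final_form: '\u{FE90}', initial: Some('\u{FE91}'), medial: Some('\u{FE92}') }, // BEH
    Forms { base: '\u{0644}', isolated: '\u{FEDD}', final_form: '\u{FEDE}', initial: Some('\u{FEDF}'), medial: Some('\u{FEE0}') }, // LAM
];

fn lookup(ch: char) -> Option<&'static Forms> {
    FORMS.iter().find(|f| f.base == ch)
}

/// Replaces each known Arabic letter with the presentation form for its position.
pub fn shape(text: &str) -> String {
    let chars: Vec<char> = text.chars().collect();
    let mut out = String::with_capacity(text.len());
    for (i, &ch) in chars.iter().enumerate() {
        let Some(forms) = lookup(ch) else {
            out.push(ch); // Unknown characters pass through unchanged.
            continue;
        };
        // The previous letter connects to this one only if it has an initial form.
        let joins_prev = i > 0 && lookup(chars[i - 1]).map_or(false, |p| p.initial.is_some());
        // This letter connects forward only if it has an initial form itself.
        let joins_next = forms.initial.is_some()
            && chars.get(i + 1).map_or(false, |&n| lookup(n).is_some());
        out.push(match (joins_prev, joins_next) {
            (true, true) => forms.medial.unwrap_or(forms.final_form),
            (true, false) => forms.final_form,
            (false, true) => forms.initial.unwrap_or(forms.isolated),
            (false, false) => forms.isolated,
        });
    }
    out
}
```

For the word BEH, ALEF, BEH, the sketch picks the initial form of BEH, the final form of ALEF, and the isolated form of the last BEH, because ALEF does not connect to the letter that follows it.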

What a solution might look like

Assuming we can't figure out how to compile C and Rust in the same WebAssembly module, we'd like a pure Rust solution that takes care of the bi-directional text transformation and the substitution of Arabic presentational characters in the output string.

There seem to be several implementations of "arabic shaping" floating around on the internet that could serve as inspiration, though a lot of them are GPL-licensed, so be careful about derivative works since Xray is MIT-licensed.

An additional challenge is that it won't be sufficient to simply transform the text. We'll also need to include enough metadata to understand where the cursor should be rendered inside a line containing bi-directional text and Arabic substitutions. I haven't thought deeply about how this mapping would be structured, but we essentially need to efficiently translate back and forth between columns in the input text and the output text.
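One possible shape for that mapping, sketched under the assumption that the transformation is a pure one-to-one reordering of characters (no ligatures that merge two input characters into one output character). `ColumnMap` and its methods are hypothetical names, and the sketch ignores the question of which edge of a character the cursor should sit on in a right-to-left run.

```rust
/// Bidirectional column lookup between logical (input) and visual (output) text.
pub struct ColumnMap {
    logical_to_visual: Vec<usize>,
    visual_to_logical: Vec<usize>,
}

impl ColumnMap {
    /// Builds the map from a permutation that gives, for each visual column,
    /// the logical column displayed there.
    /// Panics if the input contains an index that is out of range.
    pub fn from_visual_order(visual_to_logical: Vec<usize>) -> Self {
        let mut logical_to_visual = vec![0; visual_to_logical.len()];
        for (visual, &logical) in visual_to_logical.iter().enumerate() {
            logical_to_visual[logical] = visual;
        }
        ColumnMap { logical_to_visual, visual_to_logical }
    }

    /// Visual column for a logical column, or `None` past the end of the line.
    pub fn to_visual(&self, logical: usize) -> Option<usize> {
        self.logical_to_visual.get(logical).copied()
    }

    /// Logical column for a visual column, or `None` past the end of the line.
    pub fn to_logical(&self, visual: usize) -> Option<usize> {
        self.visual_to_logical.get(visual).copied()
    }
}
```

Both lookups are constant time at the cost of two index vectors per line. A production version would likely store runs rather than per-character entries, since most lines of code contain no reordering at all.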

A great solution would also contribute some kind of documentation explaining what you learn researching this problem.

khaledhosny commented 6 years ago

Using presentation forms to shape Arabic is deprecated and does not cover all languages written in the Arabic script (new Arabic characters encoded in Unicode do not have corresponding presentation forms). It also does not handle mark positioning, which is very important for readable Arabic rendering.

nathansobo commented 6 years ago

@khaledhosny Thanks for adding to this conversation as this is a subject where I could use help. Sounds like we will need to find another approach. I’m wondering then if we could take the same strategy for Arabic and Indic (on the web at least) and just render the entire line via canvas, measure via HTML, and then upload the entire line to the graphics card. Might be too slow. This is tricky.

nathansobo commented 6 years ago

Maybe we should just render any line containing text that can’t be rendered via the simple layout strategy as HTML superimposed over the canvas. It adds complexity and it may be challenging to blend it in perfectly, but it would be the easiest way to apply the underlying capabilities of the platform where they are essential to legibility. Worst case scenario we end up with the performance of other web based editors if every line requires sophisticated shaping.
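Deciding which lines take the fallback path could start with a per-line check like the sketch below. The function name `needs_platform_rendering` is hypothetical, and the code point ranges are an approximate, incomplete assumption about which blocks need bidi handling or shaping. A real check would consult Unicode script and bidi-class data instead.

```rust
/// Returns true if the line contains characters that the simple
/// left-to-right, one-glyph-per-character layout cannot render correctly.
/// Assumption: the ranges below are a coarse approximation, not a complete list.
pub fn needs_platform_rendering(line: &str) -> bool {
    line.chars().any(|ch| {
        matches!(
            ch,
            // Hebrew, Arabic, Syriac, and neighbouring right-to-left blocks.
            '\u{0590}'..='\u{08FF}'
            // Devanagari through Sinhala (major Indic blocks).
            | '\u{0900}'..='\u{0DFF}'
        )
    })
}
```

Lines for which this returns false would keep the fast atlas-based path, so typical ASCII source code would pay only the cost of one scan per line.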

khaledhosny commented 6 years ago

Text layout, editing, selection, etc. are pretty hard things to do in an internationalized way, and browsers have had decades to perfect this (and some might say they are not there yet). So if one is to redo this from scratch, it has to be planned properly instead of taking shortcuts that tend to backfire sooner or later.

quininer commented 6 years ago

Would this help? https://github.com/servo/unicode-bidi

nathansobo commented 6 years ago

@quininer I guess that could be a start for supporting right-to-left scripts that don't depend on contextual alternates, but if @khaledhosny is correct about presentational forms not being a viable solution, we may want to handle all lines that contain bi-directional text via some escape hatch that just uses the browser's built-in rendering.

molikto commented 6 years ago

I am currently researching building a GUI framework in Kotlin (and eventually porting it to Kotlin Native). Most of the work can be offloaded via JNI to open source libraries (Skia, HarfBuzz, https://github.com/HOST-Oman/libraqm). So I think the missing pieces right now are cross-platform font selection (https://github.com/servo/servo/issues/4901) and IME support, which is a work in progress (https://github.com/glfw/glfw/issues/41).

Assuming these are done, I see no real reason to base a text editor on web technologies. The web just doesn't have an API for querying glyph layout information. Kotlin Native could be a good candidate to write a text editor in.

If interested, I might give it a try for the first problem.
