yeslogic / allsorts

Font parser, shaping engine, and subsetter implemented in Rust
https://yeslogic.com/blog/allsorts-rust-font-shaping-engine/
Apache License 2.0

Ligatures #24

Closed Kethku closed 4 years ago

Kethku commented 4 years ago

I am working on a code editor and I'm interested in using Allsorts to parse font files and build a list made up solely of the ligature-type substitutions in the font file. These are particularly important in monospace rendering of Latin scripts, where other types of substitution are not.

My hope is that by only concerning myself with ligatures and manually applying ligature substitutions in my rendering code, I can speed up the normally very slow operation of text shaping.

Unfortunately I'm struggling to orient myself in the Allsorts codebase and find a way to extract the ligature list out of a given font file. Can you give me some quick pointers to get me on my way?

(Awesome work by the way. I'm very hopeful this library will work for my use case)

wez commented 4 years ago

I'm not affiliated with this project, but I share an interest in monospace fonts and font shaping. I've partially implemented the shaping function using allsorts in my terminal emulator. It's not complete and probably not correct, but the code may help you figure out where you want to look in the allsorts code:

https://github.com/wez/wezterm/blob/master/src/font/parser.rs#L294

FWIW, I haven't found shaping to be a performance bottleneck in practice (using harfbuzz primarily), and because there are a lot of different tables and substitutions to consider when shaping text, I would advise against recreating that logic in your application unless you are (or are prepared to become!) a font expert!

Kethku commented 4 years ago

Interesting. Maybe I'm wrong, but I thought that in a monospace environment (when it comes to Latin-based languages) ligatures were the only substitution I needed to worry about.

Your point that harfbuzz might not be the performance bottleneck is an interesting one. I will do some deeper profiling to confirm whether it is the problem. I appreciate the hint.

Kethku commented 4 years ago

I have confirmed that a significant part of my rendering budget, though not all of it, is spent on the CPU doing text shaping. Here is some evidence. For the absurd test of shaping and rendering an arbitrary binary as Unicode at a font size of 7, like so:

image

And rapidly scrolling the buffer, I get a flamegraph representing the time spent in each function that looks like this:

image

The red in this case is everything outside of text shaping, while the blue is time spent on text shaping during the period of rapid scrolling. Clearly this is not the only area for improvement for me, but it is a place I want to improve nonetheless. I suspect this may still not be a harfbuzz performance problem in general, but a problem with text shaping in the skia wrapper I am using. In any case, handling it myself opens the door for better caching and the like.

Further, the skia harfbuzz wrapper sometimes does weird things, like translating certain glyphs down a couple of pixels for no visible reason. I suspect this is due to some interesting font positioning rules, which I could bypass by doing the shaping myself, which is pretty trivial in a grid.
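To illustrate the kind of caching mentioned above, here is a minimal sketch in Rust. Everything in it is hypothetical: `ShapeCache` is not part of Allsorts, skia, or harfbuzz, and the `shape` closure stands in for whatever shaper is actually used. It only shows the idea of shaping each distinct text run once and reusing the result while scrolling.

```rust
use std::collections::HashMap;

/// Hypothetical memoizing wrapper around a shaping function.
/// Glyph IDs are represented as plain `u16` values for illustration.
struct ShapeCache<F: Fn(&str) -> Vec<u16>> {
    shape: F,                          // the (assumed expensive) shaper
    entries: HashMap<String, Vec<u16>>, // text run -> shaped glyph IDs
    misses: usize,                     // how many times the shaper actually ran
}

impl<F: Fn(&str) -> Vec<u16>> ShapeCache<F> {
    fn new(shape: F) -> Self {
        ShapeCache { shape, entries: HashMap::new(), misses: 0 }
    }

    /// Return the shaped glyphs for `run`, calling the shaper only on a miss.
    fn get(&mut self, run: &str) -> &[u16] {
        if !self.entries.contains_key(run) {
            let shaped = (self.shape)(run);
            self.entries.insert(run.to_owned(), shaped);
            self.misses += 1;
        }
        &self.entries[run]
    }
}
```

A real cache would also need a key that includes the font and size, plus some eviction policy; both are left out of this sketch.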

wezm commented 4 years ago

@Kethku As Wez advised, I suspect you're better off sticking with full shaping. To answer your question though, GSUB lookup type 4 (Ligature Substitution) is where ligature substitutions live. I suspect that if you only implemented support for those substitutions, you'd eventually hit fonts or text that don't work how you expect. For example, trying to edit non-Latin text in your editor.
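For orientation, here is a rough sketch of what applying only ligature substitutions could look like once the rules have been extracted. It does not use the Allsorts API: `LigatureTable`, `Ligature`, and their methods are hypothetical names, and the layout (first glyph mapping to an ordered list of candidate ligatures) is only loosely modelled on how a GSUB ligature subtable groups its rules. It ignores lookup flags, features, and every other lookup type, which is exactly the limitation described above.

```rust
use std::collections::HashMap;

/// GSUB operates on glyph IDs within one font, not on characters.
type GlyphId = u16;

/// One rule: the glyphs that must follow the first glyph, and the
/// single glyph that replaces the whole run.
struct Ligature {
    rest: Vec<GlyphId>,
    replacement: GlyphId,
}

/// Hypothetical in-memory table: first glyph -> candidates in preference order.
struct LigatureTable {
    sets: HashMap<GlyphId, Vec<Ligature>>,
}

impl LigatureTable {
    fn new() -> Self {
        LigatureTable { sets: HashMap::new() }
    }

    /// Register a rule; rules added earlier for the same first glyph win.
    fn add(&mut self, components: &[GlyphId], replacement: GlyphId) {
        if let Some((first, rest)) = components.split_first() {
            self.sets
                .entry(*first)
                .or_default()
                .push(Ligature { rest: rest.to_vec(), replacement });
        }
    }

    /// Scan left to right, replacing the first matching rule at each position.
    fn apply(&self, glyphs: &[GlyphId]) -> Vec<GlyphId> {
        let mut out = Vec::with_capacity(glyphs.len());
        let mut i = 0;
        while i < glyphs.len() {
            let matched = self.sets.get(&glyphs[i]).and_then(|set| {
                set.iter().find(|lig| glyphs[i + 1..].starts_with(&lig.rest))
            });
            match matched {
                Some(lig) => {
                    out.push(lig.replacement);
                    i += 1 + lig.rest.len(); // skip the consumed components
                }
                None => {
                    out.push(glyphs[i]);
                    i += 1;
                }
            }
        }
        out
    }
}
```

Because candidates are tried in insertion order, longer ligatures should be added before shorter ones that share the same prefix, otherwise the shorter rule would always match first.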

Kethku commented 4 years ago

Sounds good. I appreciate the advice! I will close this as I have gotten useful pointers to move forward.