linebender / skribo

A Rust library for low-level text layout.
Apache License 2.0

Add rough first draft of script matching doc #1

raphlinus closed this 5 years ago

raphlinus commented 5 years ago

Rendered.

I'm very open to ideas regarding open questions and TODO items.

raphlinus commented 5 years ago

@jfkthame I'd love to get your input on this, even if just a few pointers of where to look in the Gecko codebase.

RazrFalcon commented 5 years ago

A quick note: we can't rely on fontconfig alone on Linux, because KDE doesn't modify the fontconfig configuration by default. Instead, it stores its font settings in ~/.config/kdeglobals, so we have to handle that too.

This is how Qt does this.
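As a rough illustration of what handling kdeglobals could involve, here is a minimal sketch in Rust. It assumes the file is INI-style, that the general UI font lives under a `font=` key in a `[General]` group, and that the value is a comma-separated string whose first field is the family name. Those layout details, and the function name `kde_general_font_family`, are assumptions for illustration and should be checked against a real KDE installation.

```rust
/// Extract the family name from the `font=` entry in the `[General]` group
/// of kdeglobals-style text. Returns `None` if no such entry is found.
/// The group name, key name, and value shape are assumptions, not a spec.
fn kde_general_font_family(contents: &str) -> Option<String> {
    let mut in_general = false;
    for line in contents.lines() {
        let line = line.trim();
        // A `[Group]` header switches the current group.
        if line.starts_with('[') && line.ends_with(']') {
            in_general = line == "[General]";
            continue;
        }
        if !in_general {
            continue;
        }
        if let Some((key, value)) = line.split_once('=') {
            if key.trim() == "font" {
                // Assumed value shape: "Family,pointsize,..."; keep the first field.
                let family = value.split(',').next().unwrap_or("").trim();
                if !family.is_empty() {
                    return Some(family.to_string());
                }
            }
        }
    }
    None
}
```

A caller would read `~/.config/kdeglobals` into a string, pass it to this function, and fall back to fontconfig when it returns `None`.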

jfkthame commented 5 years ago

For Gecko, see gfxFontGroup::FindFontForChar, which in turn will call into WhichPrefFontSupportsChar and WhichSystemFontSupportsChar.

Note that the structure of the Gecko font preferences is pretty ancient, with roots in the old world of multiple 8-bit and double-byte codepages for different "language groups", and could really use an extensive rewrite...

raphlinus commented 5 years ago

@jfkthame Thanks, that's useful, but I find myself still mystified by where, in particular, Han unification logic happens. It seems like it should be in WhichSystemFontSupportsChar (as that takes an aRunScript argument), but when I drill down, I can't find any actual Han unification logic: GetCommonFallbackFonts seems not to cover CJK (except for plane 2 astral), and PlatformGlobalFontFallback seems to just drop aRunScript. I can keep digging, but maybe you know off the top of your head?

jfkthame commented 5 years ago

You probably want to look at WhichPrefFontSupportsChar, as that's where whichever CSS generic is applicable will be mapped to a font family from the (user-configurable) prefs. It'll look up a "unicode range" for the character, and then map this through gfxPlatformFontList::GetFontPrefLangFor and gfxPlatformFontList::GetLangPrefs to determine which set of prefs to use.

So in most cases, if a CJK font hasn't been explicitly named, this is where it'll get selected. Only if the font specified via prefs doesn't cover the character in question will we end up in WhichSystemFontSupportsChar.
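The ordering described here (fonts from prefs first, system-wide fallback only if none of them covers the character) can be sketched roughly as follows. This is not Gecko's code: `Font`, `Source`, and `find_font_for_char` are hypothetical names, and coverage is modeled as a plain list of code point ranges rather than a real cmap lookup.

```rust
/// Hypothetical font record: a family name plus the inclusive
/// code point ranges it covers (a stand-in for a real cmap).
struct Font {
    family: &'static str,
    coverage: Vec<(u32, u32)>,
}

impl Font {
    fn supports(&self, ch: char) -> bool {
        let cp = ch as u32;
        self.coverage.iter().any(|&(lo, hi)| lo <= cp && cp <= hi)
    }
}

/// Which stage of the lookup produced the match.
#[derive(Debug, PartialEq)]
enum Source {
    Pref,
    System,
}

/// Try the pref-derived fonts in order; only if none covers `ch`
/// fall through to the system font list.
fn find_font_for_char<'a>(
    ch: char,
    pref_fonts: &'a [Font],
    system_fonts: &'a [Font],
) -> Option<(&'a Font, Source)> {
    if let Some(font) = pref_fonts.iter().find(|f| f.supports(ch)) {
        return Some((font, Source::Pref));
    }
    system_fonts
        .iter()
        .find(|f| f.supports(ch))
        .map(|font| (font, Source::System))
}
```

Returning the stage alongside the font makes it easy to log how often the system fallback is actually reached.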

raphlinus commented 5 years ago

Ok, that's helpful, though I've got to say it's not easy to figure out what's going on from reading the code.

However, having come across the bug "implement font cascading for system fonts under OSX", it seems like this might be the answer I'm looking for: CTFontCopyDefaultCascadeListForLanguages. That linked bug identifies a few problems with the approach, but I'm wondering whether I should be pursuing this or trying to replicate what Gecko does.

And after a little more digging, I found the source of truth for that: lang-tags in the font.name-list settings, with a "hardcoded" list of fonts from platform-specific #ifdefs in libpref/init/all.js. It's not really hardcoded because these can be changed, but I think it's a good bet that 99.99% of users won't touch those. This raises the requirements question of whether this needs to be configurable, or whether we can count on font-kit to get the correct information out of the system.
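To make the configurability question concrete, here is a minimal sketch of a language-tagged font list of the kind described above: a table keyed by language tag, where each entry is an ordered list of families. The same Han code point would then resolve to different fonts depending on the language of the run. The family names are placeholders rather than the actual contents of all.js, and `example_name_lists` and `families_for_lang` are hypothetical helpers.

```rust
use std::collections::HashMap;

/// Build a placeholder table from language tag to an ordered family list.
/// Real defaults would come from the platform or from user configuration.
fn example_name_lists() -> HashMap<&'static str, Vec<&'static str>> {
    let mut lists = HashMap::new();
    lists.insert("ja", vec!["Example Gothic JP", "Example Mincho JP"]);
    lists.insert("zh-CN", vec!["Example Hei SC"]);
    lists.insert("zh-TW", vec!["Example Hei TC"]);
    lists.insert("ko", vec!["Example Gothic KR"]);
    lists
}

/// Return the candidate families for `lang`, falling back to the list for
/// `default_lang`, and to an empty slice if neither tag has an entry.
fn families_for_lang<'a>(
    lists: &'a HashMap<&'static str, Vec<&'static str>>,
    lang: &str,
    default_lang: &str,
) -> &'a [&'static str] {
    lists
        .get(lang)
        .or_else(|| lists.get(default_lang))
        .map(|families| families.as_slice())
        .unwrap_or(&[])
}
```

If the table is plain data like this, making it configurable is cheap; the harder question is where the defaults come from on each platform.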

raphlinus commented 5 years ago

I've added significant new content, based on investigations of Gecko and Qt. This certainly seems to be a complicated problem domain. Again, feedback is welcome!