servo / font-kit

A cross-platform font loading library written in Rust
Apache License 2.0

Looking up CSS generic families #100

Open SimonSapin opened 5 years ago

SimonSapin commented 5 years ago

In CSS, the serif, sans-serif, cursive, fantasy, and monospace keywords when used without quotes have a different meaning than other keywords (which are space-joined into a single string) or quoted strings. They are "generic" font families.
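As a rough illustration of that distinction, here is a minimal sketch of classifying one comma-separated `font-family` entry. The `FamilyEntry` and `Generic` types and the `classify` helper are hypothetical names made up for this example, not part of font-kit, and the parsing is deliberately simplified (no escape handling, no identifier validation).

```rust
/// The five CSS generic font families.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Generic {
    Serif,
    SansSerif,
    Cursive,
    Fantasy,
    Monospace,
}

/// One entry of a `font-family` list after classification (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
enum FamilyEntry {
    /// An unquoted generic keyword such as `serif`.
    Generic(Generic),
    /// A concrete family name: a quoted string, or unquoted keywords
    /// joined by single spaces.
    Named(String),
}

/// Classifies a single entry. Simplified: assumes well-formed input.
fn classify(entry: &str) -> FamilyEntry {
    let entry = entry.trim();

    // A quoted string is always a concrete family name, even "serif".
    for quote in ['"', '\''] {
        if entry.len() >= 2 && entry.starts_with(quote) && entry.ends_with(quote) {
            return FamilyEntry::Named(entry[1..entry.len() - 1].to_string());
        }
    }

    // CSS keywords are ASCII case-insensitive.
    let generic = match entry.to_ascii_lowercase().as_str() {
        "serif" => Some(Generic::Serif),
        "sans-serif" => Some(Generic::SansSerif),
        "cursive" => Some(Generic::Cursive),
        "fantasy" => Some(Generic::Fantasy),
        "monospace" => Some(Generic::Monospace),
        _ => None,
    };

    match generic {
        Some(g) => FamilyEntry::Generic(g),
        // Any other unquoted keywords are space-joined into one name.
        None => FamilyEntry::Named(entry.split_whitespace().collect::<Vec<_>>().join(" ")),
    }
}
```

With this sketch, `classify("serif")` yields a generic family, while `classify("\"serif\"")` yields a concrete family that happens to be named "serif".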

Currently Source::select_family_by_generic_name allows looking up concrete font families from those keywords, but it is #[doc(hidden)] with a FIXME comment about returning multiple families.

In the specification:

https://drafts.csswg.org/css-fonts/#generic-font-families

“All five generic font families must always result in at least one matched font face, for all CSS implementations. However, the generics may be composite faces (with different typefaces based on such things as the Unicode range of the character, the language of the containing element, user preferences and system settings, among others). They are also not guaranteed to always be different from each other.”

https://drafts.csswg.org/css-fonts/#font-style-matching

“User agents may choose the generic font family to use based on the language of the containing element or the Unicode range of the character.”

Based on other uses, the spec appears to misuse "character" to mean Unicode code point. But the paragraph that begins “When text contains characters such as combining marks” specifies handling based on grouping by grapheme cluster.

So perhaps Source::select_family_by_generic_name should be unhidden, but after some API change. Would an input "character" (perhaps a char code point, or a &str grapheme cluster?) be useful to any of the backends? What about an optional language tag?

pcwalton commented 5 years ago

My gut tells me that having a "hint" structure could be useful as a parameter to the font selection methods. This hint structure could optionally include an &str extended grapheme cluster (EGC) and a language tag. I like the name "hint" because it strongly implies that we can modify the algorithm over time, since the specs are clearly imprecise right now. It will be fine if Servo relies on the details of how the matching algorithm uses this "hint", but calling it a "hint" suggests to other users of font-kit that they should not rely on these details, because they're subject to change as the CSS and/or Unicode specifications evolve.
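To make the idea concrete, here is one possible shape for such a hint, as a sketch only. `SelectionHint`, `GenericLookup`, and `TableBackend` are hypothetical names invented for this example and do not exist in font-kit. The family names in the usage note are placeholders, and the toy backend uses only the language tag, which shows that a backend is free to ignore parts of the hint.

```rust
/// Hypothetical hint passed to font selection. Every field is optional,
/// and backends may ignore any of them.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
struct SelectionHint<'a> {
    /// Extended grapheme cluster the caller wants to render.
    grapheme_cluster: Option<&'a str>,
    /// Language tag of the containing element, e.g. "ja".
    language: Option<&'a str>,
}

impl<'a> SelectionHint<'a> {
    fn new() -> Self {
        Self::default()
    }

    fn with_grapheme_cluster(mut self, egc: &'a str) -> Self {
        self.grapheme_cluster = Some(egc);
        self
    }

    fn with_language(mut self, tag: &'a str) -> Self {
        self.language = Some(tag);
        self
    }
}

/// Hypothetical lookup interface. It returns several candidate families in
/// priority order, since a generic family may map to more than one.
trait GenericLookup {
    fn families_for_generic(&self, generic: &str, hint: &SelectionHint<'_>) -> Vec<String>;
}

/// Toy in-memory backend: rows of (generic, language or None, family).
struct TableBackend {
    entries: Vec<(&'static str, Option<&'static str>, &'static str)>,
}

impl GenericLookup for TableBackend {
    fn families_for_generic(&self, generic: &str, hint: &SelectionHint<'_>) -> Vec<String> {
        let mut out = Vec::new();
        // Language-specific rows come first when the hint names a language.
        if let Some(lang) = hint.language {
            for (g, l, family) in &self.entries {
                if *g == generic && *l == Some(lang) {
                    out.push(family.to_string());
                }
            }
        }
        // Language-neutral rows follow as the fallback.
        for (g, l, family) in &self.entries {
            if *g == generic && l.is_none() {
                out.push(family.to_string());
            }
        }
        out
    }
}
```

For example, given a backend with the placeholder rows `("serif", None, "Example Serif")` and `("serif", Some("ja"), "Example Serif JP")`, a hint built with `SelectionHint::new().with_language("ja")` would rank the language-specific family ahead of the default one, while an empty hint would return only the default.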

How does that sound?