w3c / csswg-drafts

CSS Working Group Editor Drafts
https://drafts.csswg.org/

[css-fonts-4] Extend font-language-override to override OpenType "script" as well #3178

Closed faceless2 closed 6 years ago

faceless2 commented 6 years ago

Selecting features from an OpenType font requires the correct language and script to be set, and where the language in the font GSUB table doesn't match the language in the source document, we have "font-language-override" to force it. But there are two issues:

Why we need to override OpenType script

The PrinceXML sample document at http://css4.pub/2015/malthus/essay.html references the OpenType font "Adobe Caslon Pro", then turns on oldstyle numbers with what is effectively font-variant: oldstyle-nums discretionary-ligatures;. The document later contains a table cell containing just digits, <td>789</td>.

The problem: Adobe Caslon defines just one script/language combination: the "latn" script with the default language. Digits have the Unicode script [1] "Common", not "Latin", so there should be no matching script/language combination, and oldstyle numbers substitution should not occur.

That's clearly not the intention of the document author. The OpenType spec suggests flexibility on which script to choose when "Common" glyphs are mixed in with another script such as "Latin", which is fine if they're in the middle of a paragraph. However, where they're in an isolated HTML element, determining which other script they might reasonably belong to is guesswork. Prince guesses right, but it's better to allow it to be specified explicitly, as we do for languages.
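To make the strict reading concrete, here is a minimal Python sketch. It is only a toy model: `unicode_script`, `SCRIPT_TAGS` and `strict_script_tag` are hypothetical helpers, not part of any font or Unicode library, and the Script property is approximated for ASCII only. The font's script list is assumed to be just "latn", as described above.

```python
from __future__ import annotations

# Hypothetical, simplified model: not a real font or Unicode API.
def unicode_script(ch: str) -> str:
    """Very rough stand-in for the UAX #24 Script property (ASCII only)."""
    if ch.isascii() and ch.isalpha():
        return "Latin"
    return "Common"  # digits, punctuation, spaces

# OpenType script tag for each Unicode script this sketch knows about.
SCRIPT_TAGS = {"Latin": "latn"}

def strict_script_tag(text: str, font_scripts: set[str]) -> str | None:
    """Return the font's script tag for the run, or None if nothing matches."""
    scripts = {unicode_script(ch) for ch in text}
    if len(scripts) != 1:
        return None  # mixed-script runs are out of scope for this sketch
    tag = SCRIPT_TAGS.get(scripts.pop())
    return tag if tag in font_scripts else None

caslon_scripts = {"latn"}  # assumed: the font's only script entry, per the issue
print(strict_script_tag("Essay", caslon_scripts))  # a Latin run matches "latn"
print(strict_script_tag("789", caslon_scripts))    # a digits-only run matches nothing
```

Under this reading, the isolated `<td>789</td>` run finds no script entry at all, so no GSUB features would be applied to it.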

Current syntax

normal | <string>

Possible proposed syntax

Two options: either add a font-script-override property, or change font-language-override to:

normal | none | <string> [ normal | none | <string>]?

Some examples:

font-language-override: normal;  /* no overriding */
font-language-override: none;  /* override language to OpenType default */
font-language-override: "URD";  /* override language to Urdu */
font-language-override: "latn";  /* override script to Latin */
font-language-override: "KUY" "lao";  /* override language to Kuy, script to Lao */
font-language-override: "lao" "KUY";  /* same */
font-language-override: normal none; /* override script to Common */
font-language-override: none normal; /* override language to OpenType default */
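One possible reading of these examples, sketched in Python: keywords are positional (language first, script second), while strings are classified by case (lowercase for a script tag, uppercase or digits for a language tag), which is why the two string orders are equivalent. This is an interpretation of the examples only. `parse_override` is a hypothetical helper, not an API of any browser or CSS library, and the proposal itself does not spell out these rules.

```python
import re

KEYWORDS = {"normal", "none"}

def parse_override(value: str) -> dict:
    """Parse a proposed font-language-override value into language/script slots."""
    tokens = re.findall(r'"[^"]*"|\S+', value.strip().rstrip(";"))
    if not 1 <= len(tokens) <= 2:
        raise ValueError("expected one or two components")
    slots = {}
    for position, token in enumerate(tokens):
        if token in KEYWORDS:
            # Assumption: keywords are positional, language first, then script.
            slot = ("language", "script")[position]
        elif len(token) >= 2 and token.startswith('"') and token.endswith('"'):
            tag = token[1:-1]
            # Assumption: lowercase strings are script tags,
            # uppercase-or-digit strings are language system tags.
            if tag.isalpha() and tag.islower():
                slot = "script"
            elif tag and all(c.isupper() or c.isdigit() for c in tag):
                slot = "language"
            else:
                raise ValueError(f"cannot classify tag {token}")
        else:
            raise ValueError(f"unexpected token {token}")
        if slot in slots:
            raise ValueError(f"{slot} given twice")
        slots[slot] = token  # strings keep their quotes to stay distinct from keywords
    return {"language": slots.get("language", "normal"),
            "script": slots.get("script", "normal")}

print(parse_override('"KUY" "lao"'))
print(parse_override('"lao" "KUY"'))
print(parse_override("normal none"))
```

With these rules, the two string orders parse to the same result, and `normal none` leaves the language alone while setting the script to `none`.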

In the PrinceXML example described above, we could easily work around the lack of the recommended "default" script entry in the font by doing

body {
    font-variant: oldstyle-nums discretionary-ligatures;
    font-language-override: "latn";
}

[1] https://unicode.org/reports/tr24/
[2] This should read "uppercase or digits" and not necessarily three letters: https://github.com/w3c/csswg-drafts/issues/1104

faceless2 commented 6 years ago

Pulling this. Needs more thought.

svgeesus commented 6 years ago

Okay, but this does sound interesting so please re-open when you have worked on it some more!

faceless2 commented 6 years ago

Just for posterity, I pulled this because neither Firefox, Chrome, Safari nor Prince agreed with my assessment. As inconceivable as it seems to me, I was clearly missing something.

It's that HarfBuzz (the OpenType layout engine which I suspect underlies all these products) will choose "latn" as the script of choice if the font has no "dflt" script. The comment at hb-ot-layout.cc line 1345 reads: "try with 'latn'; some old fonts put their features there even though they're really trying to support Thai, for example :(". HarfBuzz was tested against MS Uniscribe, so it's pretty much correct-by-definition regardless of what the (slightly sloppy) OpenType spec says.
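The fallback described here can be sketched roughly as follows. This is a Python illustration loosely modelled on that behaviour as I understand it, not the actual HarfBuzz code: `select_script` is a hypothetical helper, and the assumption that a digits-only run requests the "DFLT" tag is mine.

```python
def select_script(requested, font_scripts):
    """Pick a script tag from the font, with a 'latn' fallback of last resort.

    Returns (tag, exact), where exact is False when a fallback was used.
    """
    for tag in requested:
        if tag in font_scripts:
            return tag, True
    # Fallbacks, in order: the default script (both spellings), then 'latn',
    # because some old fonts put all their features under 'latn'.
    for fallback in ("DFLT", "dflt", "latn"):
        if fallback in font_scripts:
            return fallback, False
    return None, False

# Assumed: a digits-only run asks for the default script,
# and the font defines only 'latn'.
print(select_script(["DFLT"], {"latn"}))
```

With a font that defines only "latn", the digits-only run ends up on "latn" anyway, which would explain why the oldstyle numbers appear in all four products.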

Theoretically there's still a need for this, or something like it. But I presume "font-language-override" came about due to an actual case where it was required. I no longer have one of those to back up this issue, so it's probably best I leave this one closed.

svgeesus commented 6 years ago

If the OpenType spec is unclear, I suggest sending a message describing the issue (and ideally, suggested wording for a clarification) to the MPEG SC29/WG11 ad-hoc group which maintains it. (Ping me if you don't have the address).