stevengj / subsuper-proposal

Draft proposal for additional sub/superscript characters in Unicode

Provision of Fonts #7

Closed by lambdafu 3 years ago

lambdafu commented 7 years ago

I am not sure this is all that relevant given that all proposed characters will be straight derivatives of existing glyphs. Still, this is the corresponding issue from the text.

cormullion commented 4 years ago

@stevengj

I don't know how the Unicode Consortium decide matters, but I would guess that they want graphical implementations only for new/previously unconsidered glyphs; I don't think there's any mystery what a subscript lowercase letter "a" should look like. Also, perhaps they aren't too interested in the detailed design of the glyphs, judging from:

The shapes of the reference glyphs used in these code charts are not prescriptive. Considerable variation is to be expected in actual fonts. [The Unicode Standard, Version 13.0 Archived Code Charts]

If you require a font that has these proposed characters, for testing or evaluation purposes, you could consider the JuliaMono font, which contains designs for superscripts and subscripts for both Latin and Greek uppercase and lowercase alphabets, plus digits and some punctuation. They're all stored in a private use area (0xF0000 to 0xF0294). Some are also aliases to existing Unicode code points, but they can easily be moved to a better location one day (since a script specified their temporary code points).
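As a quick sanity check on that range, here is a minimal Python sketch (standard library only) that counts the code points between the two endpoints quoted above and confirms that Python's Unicode database classifies them as private use. The variable names are just for illustration; nothing here inspects the font itself.

```python
import unicodedata

# Range quoted in the comment above (the font's temporary code points).
PUA_START, PUA_END = 0xF0000, 0xF0294

code_points = range(PUA_START, PUA_END + 1)

# "Co" is the Unicode general category for private-use code points.
all_private = all(unicodedata.category(chr(cp)) == "Co" for cp in code_points)

slot_count = len(code_points)
print(slot_count, all_private)
```

Because these are private-use code points, they carry no standard meaning: text that uses them only makes sense together with a font that assigns glyphs to them.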

In a text editor (viewed on a 5K Retina iMac display, YMMV), they look like this:

[Image: Screenshot 2020-09-02 at 14 40 56]

From a design perspective, the main point of interest is that you don't simply scale regular glyphs down to make subscript and superscript glyphs (top). Instead (bottom), you increase the stroke weight so as to more closely match the designs of the normal glyphs.

[Image: Screenshot 2020-09-02 at 14 51 56]

and you can make other small design adjustments - enlarging counters for example - to allow for the decreased legibility.

(This is the same principle behind the design of Small Capitals:

Well-designed small capitals are not simply scaled-down versions of normal capitals; they normally retain the same stroke weight as other letters and have a wider aspect ratio for readability. [https://en.wikipedia.org/wiki/Small_caps] )

I noticed that there aren't many programming fonts that even support the existing subscript and superscript characters, so it's not easy to predict how many font designers will implement a full set once they're added to Unicode. A quick look for the subscript m and n characters gives:

[Image: Screenshot 2020-09-02 at 15 06 46]

So basically, only DejaVu, Everson, Iosevka, Pragmata, and VictorMono have good support. Fonts like the ligature-rich Fira Code, Roboto, or the fonts used by GitHub don't venture this far. Fortunately, font substitution fills in the gaps. Perhaps the successful acceptance of this proposal will provide a suitable incentive for other font designers to support more mathematical symbols!
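For reference, the existing subscript m and n characters are U+2098 and U+2099. A small Python sketch using the standard `unicodedata` module shows how Unicode already describes them: each has a compatibility decomposition tagged `<sub>` that points back at the plain letter. The helper name `describe` is just for illustration.

```python
import unicodedata

def describe(cp):
    """Return (name, decomposition) for a code point from Python's Unicode database."""
    ch = chr(cp)
    return unicodedata.name(ch), unicodedata.decomposition(ch)

subscript_m = describe(0x2098)
subscript_n = describe(0x2099)

# Compatibility normalization (NFKC) folds the subscripts back to plain letters.
folded = unicodedata.normalize("NFKC", "\u2098\u2099")

print(subscript_m, subscript_n, folded)
```

This only reports what the Unicode character database says; whether a given font actually draws these characters still has to be checked in the font.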

stevengj commented 4 years ago

Thanks!

If we pursue a combining-character approach, is it possible for the font to provide the superscript as a ligature, e.g. for "<lowercase a><mathematical superscript combining mark>", with no changes in font-rendering software?

cormullion commented 4 years ago

The "contextual alternates" feature can detect patterns in text and swap glyphs in to make that section of the text look better. So for example, the feature code:

sub hyphen' greater by hyphen.alternate;
sub hyphen.alternate greater' by greater.alternate;

looks for - followed by > and then replaces both glyphs with slightly different ones that join up.
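To make the mechanics concrete, here is a toy Python model of those two rules. It is a simplified sketch, not how a real shaping engine is implemented: the function name `apply_calt` and the two-pass loop are assumptions made for illustration. The point it shows is that substitution happens on the glyph list, while the stored text is never modified.

```python
def apply_calt(glyphs):
    """Toy model of the two rules quoted above; not a real shaping engine."""
    out = list(glyphs)
    # Rule 1: sub hyphen' greater by hyphen.alternate;
    for i in range(len(out) - 1):
        if out[i] == "hyphen" and out[i + 1] == "greater":
            out[i] = "hyphen.alternate"
    # Rule 2: sub hyphen.alternate greater' by greater.alternate;
    for i in range(1, len(out)):
        if out[i - 1] == "hyphen.alternate" and out[i] == "greater":
            out[i] = "greater.alternate"
    return out

text = "a->b"                                  # what is stored in the file
names = {"-": "hyphen", ">": "greater"}        # character -> glyph name
glyphs = [names.get(ch, ch) for ch in text]    # glyph run before substitution
shaped = apply_calt(glyphs)                    # glyph run after substitution

print(text, glyphs, shaped)
```

Note that the shaped run still has one glyph per character, which is why this approach keeps a monospaced grid intact.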

So with this mechanism it would be possible to look for, say, an x followed by 2 and then replace the 2 with a superscript, giving x squared. Of course that would do it everywhere... And without that font you'd see x2.

I'm not sure this does what you want...

stevengj commented 4 years ago

The pattern we would want to look for would be a character followed by the new combining character; from this document, it sounds like what we want is a ligature:

Ligatures and contextual alternates basically serve the same purpose: they offer solutions for problematic or unwieldy glyph sequences. Although their name implies that they connect two or more glyphs, some ligatures actually solve problems by shortening letter parts, exactly like some contextual alternates do. So why are they two distinct features? The fundamental difference between them is that ligatures replace two or more glyphs with one combined glyph, while contextual alternates only change the appearance of one glyph at a time. A ligature typically resolves one single awkward glyph combination; a contextual alternate can offer solutions for many different graceless scenarios.

We want to replace two or more characters (e.g. a Latin letter followed by the superscript combining character) with a combined glyph (a superscript Latin letter), so that sounds more like a ligature?

On the other hand, this is more analogous to Unicode Variation Sequences (how emoji variants are handled), which seems to be yet another mechanism in OpenType…

So I'm a little confused about which mechanism one would use to implement this.

cormullion commented 4 years ago

In monospaced fonts, you don't want to replace two glyphs with one, going from occupying two spaces to one. That's why the 'ligatures' approach used in fonts like Fira Code actually uses the contextual alternates approach, replacing two glyphs with two redesigned glyphs that still occupy the same space. (And you can typically put your cursor/caret between them, which wouldn't be possible if it was a single glyph.) The replacement glyph can't be two spaces wide, I think, otherwise you 'break' the monospaced-ness.

One question is, what is in the file/terminal buffer? With the -> example above, the source file/terminal buffer always retains those two characters; only their on-screen appearance changes. If you flip from font A to font B, the appearance might switch from ligature to non-ligature, revealing the two original characters. What happens when you alternate between fonts that do and don't support the combining character...

The emoji handling might be worth investigating - I know nothing about this though.

stevengj commented 4 years ago

In monospaced fonts, you don't want to replace two glyphs with one

I'm talking about a combining character here, which has zero width and no glyph even in a monospaced font.

What happens when you alternate between fonts that do and don't support the combining character...

Then the preceding character would display normally (no superscript) and the combining character would be invisible (since it is a zero-width char with no visible glyph), assuming you have a fallback font that knows that this is an invisible combining char.

cormullion commented 4 years ago

Does your proposal make use of OpenType's sups/subs features? https://docs.microsoft.com/en-us/typography/opentype/spec/features_pt#-tag-subs

When I've seen this (eg on the Typography panel on MacOS) it seems to apply to the entire document or a selection - rather than to a character.

stevengj commented 4 years ago

The proposal here is for an addition to the Unicode standard. It is agnostic regarding how the proposal would be implemented in a font or text-rendering system, but we would like to offer reasonable suggestions in the latter regard.

The proposal does mention the OpenType scientific inferiors/superiors feature as a possible implementation aid. Apparently the sinf feature is preferred to subs for mathematical use.

cormullion commented 4 years ago

oh sorry, I didn't look inside that TEX file...😀

The sinf/subs features currently work well (eg using the Typography panel in MacOS using JuliaMono and a compliant text editor application), but the transformations are applied to all the text not a single occurrence.

I think I can implement any feature that's available to an OpenType font (if it's supported by software such as Glyphs), but the mechanics of text-rendering on a user's device are probably outside that domain, or at least outside my knowledge.

Let me know if there's anything font-specific you need.

cormullion commented 4 years ago

Just an addendum - the variation selectors idea works; at least, I managed to get it working in VS Code:

variational-selectors-1

What you can't see is me typing the U+FE00 and U+FE01 Variation Selector characters using macOS's Unicode Hex Input. The font has a feature set that replaces a normal character such as "2" with a superscript or subscript version if the VS1 or VS2 character is input immediately afterwards.
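A short Python sketch shows what actually ends up in the text buffer in that experiment: two code points, the digit and the selector. It uses only the standard `unicodedata` module; the pairing of "2" with VS1 is this font's own experiment as described above, not something this snippet verifies against the Unicode data files.

```python
import unicodedata

VS1, VS2 = "\uFE00", "\uFE01"

# The stored text is the base character followed by the selector.
sequence = "2" + VS1
stored = [f"U+{ord(c):04X}" for c in sequence]

# Both selectors are nonspacing marks (general category "Mn").
info = [(unicodedata.name(c), unicodedata.category(c)) for c in (VS1, VS2)]

print(stored, info)
```

A font without the substitution rule would simply show the plain "2", since the selector itself has no visible glyph.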

stevengj commented 4 years ago

Supposing the Unicode consortium added two new combining characters U+XXXX and U+YYYY for sub/superscripts. Could you use an existing font feature to make those convert the preceding character to a sub/superscript glyph (at least for a subset of characters if needed)?

cormullion commented 4 years ago

Yes, VS1 and VS2 are just zero width characters with special powers, so presumably any character given the same special powers would work. I'm assuming a subset, because each character in the subset has to be included individually by name in the feature. Well, perhaps a few thousand wouldn't be impossible, since it's scriptable...

stevengj commented 4 years ago

so presumably any character given the same special powers would work

But would you be able to give the character these "special powers" just by modifying the font data, or would you need to change the text-rendering software as well?

In the proposal, we are suggesting that < 200 characters be initially supported.

cormullion commented 4 years ago

I'm kind of assuming that the VS1 and VS2 codes are interpreted by Unicode routines in the text-rendering software - all the font is doing is obeying the resulting instructions from the application to supply a superscript glyph.

Perhaps you have to ask for "new" VS1 and VS2 codes...? Or perhaps it's OK to use them here? Might be up to Consortium rules, if they're listed anywhere.

stevengj commented 4 years ago

No, they have to be new codepoints — we're trying to add a new semantic meaning in Unicode, not simply change the font rendering of existing characters.

Couldn't the combination <U+0061><U+XXXX> (where XXXX is the new mathematical-superscript codepoint) be registered in the font as a ligature for the superscript "a" glyph?

cormullion commented 4 years ago

Currently I've got feature code like this:

sub a' VS2 by uF001A;

which will swap out a and replace it with the superscript version in U+F001A if the user follows the a with the VS2 character.

stevengj commented 4 years ago

Let me be more concrete:

What would you do if Unicode 15 came out next week introducing a new superscript combining character U+AF001, and you wanted <U+005A><U+AF001> to display as a superscript Z? (There is currently no superscript Z in Unicode.)

Would you be able to do it with existing font/display software by defining e.g. a new ligature glyph for that combination of codepoints?

cormullion commented 4 years ago

(Hypothetically) I can add this code to the font:

sub Z' VS20 by uF0019;

where Z is the way you specify <U+005A>, ' targets it for replacement, VS20 would be a possible internal name for U+AF001, and uF0019 is the font's internal name for the glyph that will be drawn instead of Z. (This doesn't have to be a superscript Z glyph, or even any existing Unicode character. I could replace Z with a picture of a dragon if the font had one; fortunately the font has a superscript Z glyph at that location.)
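The hypothetical rule can be sketched as a toy Python model. Everything here is an assumption for illustration: U+AF001 is the made-up code point from the question, `CMAP`, `RULES`, `ZERO_WIDTH` and `shape` are invented names, and a real text-rendering stack does far more than this. The sketch compares a font that has the rule with one that lacks it, under the assumption stated earlier in the thread that the combining character is treated as invisible.

```python
SUPERSCRIPT_MARK = 0xAF001  # hypothetical code point from the question above

# Code point -> glyph name. "VS20" and "uF0019" are the names used in the
# comment above; the mapping itself is a toy stand-in for the font's cmap.
CMAP = {0x5A: "Z", SUPERSCRIPT_MARK: "VS20"}

# Contextual rule: (glyph, following glyph) -> replacement for the first glyph.
RULES = {("Z", "VS20"): "uF0019"}

# Glyphs assumed to be zero-width and invisible.
ZERO_WIDTH = {"VS20"}

def shape(text, rules):
    """Map characters to glyph names, apply the rules, return visible glyphs."""
    glyphs = [CMAP.get(ord(ch), ".notdef") for ch in text]
    for i in range(len(glyphs) - 1):
        replacement = rules.get((glyphs[i], glyphs[i + 1]))
        if replacement:
            glyphs[i] = replacement
    return [g for g in glyphs if g not in ZERO_WIDTH]

text = "Z" + chr(SUPERSCRIPT_MARK)
with_rule = shape(text, RULES)      # font that has the substitution
without_rule = shape(text, {})      # font that lacks it

print(with_rule, without_rule)
```

In both cases the stored text is the same two code points; only the glyph chosen for Z differs.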

stevengj commented 4 years ago

VS20 would be a possible internal name for U+AF001

Just to be clear, you can add that new internal name yourself, without any update to OpenType?

Can you link to docs for how you would do all this (i.e. how to define a name for U+AF001, how to define a substitution like this, and how to define a name for a new glyph), which I could cite from the proposal?

cormullion commented 4 years ago

Just to be clear, you can add that new internal name yourself, without any update to OpenType?

The names VS1 to VS15 referring to the current variation selectors are actually provided by the font design software, whose developers prefer to give users readable names rather than hexadecimal codes. I'd assume that any "new superscript combining characters" introduced by the Unicode Consortium would be allocated a similar name by the font design software developers in an update. I could then use the name in the feature code. VS20 would be a possible name... I'm trying to guess likely scenarios here, but who knows what developers might do?
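For what it's worth, glyph names like uF0019 and uF001A used earlier look consistent with the Adobe Glyph List naming convention, where a glyph can be named purely from its code point: "uni" plus four uppercase hex digits for the Basic Multilingual Plane, or "u" plus four to six hex digits in general. A minimal Python sketch of that convention follows; the function name `agl_glyph_name` is made up for illustration, and whether a particular font editor accepts such names without an update is an assumption that would need checking.

```python
def agl_glyph_name(cp):
    """Build a code-point-based glyph name in the Adobe Glyph List style."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not a valid Unicode scalar value: {cp:#x}")
    if cp <= 0xFFFF:
        return f"uni{cp:04X}"   # BMP form: "uni" + exactly four hex digits
    return f"u{cp:04X}"         # supplementary planes: "u" + five or six hex digits

examples = [agl_glyph_name(cp) for cp in (0x5A, 0xF0019, 0xAF001)]
print(examples)
```

Under this convention, the hypothetical U+AF001 could be referred to by a hex-based name even before any tool ships a friendlier alias for it.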

Good resources for the various tasks involved here are:

and the Glyphs manual in general is very informative:

https://glyphsapp.com/downloads/handbook/Glyphs-Handbook-2.3.pdf (PDF)

I hope some of these comments have been helpful. For more authoritative answers I'd recommend talking to an expert; I've only been experimenting in this field for a couple of months...

stevengj commented 3 years ago

I'd assume that any "new superscript combining characters" introduced by the Unicode Consortium would be allocated a similar name by the font design software developers in an update.

That's what I'm wondering about. So it is necessary for the font software to be updated before you can use a new combining character? Why can't you simply say that "codepoint combination XY corresponds to glyph Z" without updated software?

stevengj commented 3 years ago

For more authoritative answers I'd recommend talking to an expert

Do you know any experts that would be willing to help advise us on our proposal? @StefanKarpinski has been looking into this as well.

cormullion commented 3 years ago

Do you know any experts that would be willing to help advise us on our proposal? @StefanKarpinski has been looking into this as well.

Try TypeDrawers.com, where all the professional type designers hang out.