silnrsi / font-ttf-scripts

Font::TTF::Scripts perl module
Artistic License 2.0

TypeTuner: provide a variant to disable ligatures #24

Closed: mirabilos closed this issue 11 months ago

mirabilos commented 1 year ago

I just discovered that Gentium Plus automatically creates a ligature from ff.

This is plain wrong:

I get it’s fancy and people want this and all, but for those of us who want sanity, please add an option to TypeTuner Web to have the resulting font disable automatic ligatures. (It’s, after all, the only permitted way to customise default settings of your fonts.)

jvgaultney commented 1 year ago

Hi - this issue is in the wrong project, as the control of features is font-specific. We won't be changing the fonts as you recommend. Here's why:

The U+FBxx presentation forms are not intended to be used this way. They only exist for backwards compatibility reasons. You really don't want to use those characters in your text. Using f+l is the recommended way to get an fl ligature. See the Unicode FAQ on Ligatures, Digraphs, Presentation Forms.
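As an illustration of why the U+FBxx characters count as compatibility forms rather than independent letters, here is a minimal Python sketch. It relies only on the standard `unicodedata` module; the helper name `expand_presentation_forms` is made up for this example and is not part of any font tooling.

```python
import unicodedata

# U+FB00..U+FB04 are the Latin f-ligature presentation forms.
PRESENTATION_FORMS = ["\ufb00", "\ufb01", "\ufb02", "\ufb03", "\ufb04"]


def expand_presentation_forms(text: str) -> str:
    """Replace compatibility characters such as U+FB02 with plain letters.

    NFKC applies every compatibility mapping, not only the ligature ones,
    so it may also change other characters (superscripts, for example).
    """
    return unicodedata.normalize("NFKC", text)


for ch in PRESENTATION_FORMS:
    print(f"U+{ord(ch):04X} -> {expand_presentation_forms(ch)!r}")
```

Each presentation form carries a compatibility decomposition to its component letters, which is what lets normalizing software treat U+FB02 as plain f + l.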

Many applications already include the ability to turn off automatic OpenType ligatures. It's already possible to do that in CSS. In even more cases you can insert a zero width non-joiner (ZWNJ, U+200C) between the characters to tell the app specifically not to form a ligature.

Finally, our standard ligatures are designed to be subtle, and should not cause confusion or distraction, even when the letters cross compound word boundaries. I know that some German typographic traditions are strict about this, but that situation can usually be solved by turning off ligatures or using a ZWNJ. Can you give an example where one of our automatic ligatures causes a problem that you can't fix with these techniques?

Is this specifically for German? One way we could improve this is to create a German-language-only feature that disables the ligatures in all cases. However that probably would not work any better or more widely than the existing techniques.

Here's an example of inserting a ZWNJ to break a ligature in Microsoft Word:

image

mirabilos commented 1 year ago

Hi,

The U+FBxx presentation forms are not intended to be used this way. They only exist for backwards compatibility reasons. You really don't want to use those characters in your text.

but I do, because that’s when I want ligatures.

Using f+l is the recommended way to get an fl ligature. See the

Depends on whom you ask. It’s a matter between opting in to them and opting out of them.

Many applications

The web browser, through the CSS options I began using in 2022, is unfortunately the first application I use that can toggle OpenType features at all. Most applications can barely do more than bold/italic/underline (I’m thinking of my musical notation software here, for example). I cannot rely on applications ☹

In even more cases you can insert a zero width non-joiner (ZWNJ, U+200C) between the characters to tell the app specifically to not form a ligature.

The problem with the opt-out approach is that it requires explicit opting out by using characters that do show up when copy/pasting, whereas opting in will use codepoints that are specifically designed for that (though with the same argumentation, ZWJ in the middle most likely would also work).
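To make the copy/paste concern concrete, the following Python sketch inserts a ZWNJ into ligature-prone sequences and shows that the result is no longer the same string as the original. The pattern (an "f" followed by "f", "i" or "l") and both function names are assumptions chosen for illustration; a real font may ligate other sequences.

```python
import re

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# Assumed set of ligature triggers: an "f" directly followed by f, i or l.
LIGATURE_PATTERN = re.compile(r"f(?=[fil])")


def break_ligatures(text: str) -> str:
    """Insert a ZWNJ after each 'f' that is followed by 'f', 'i' or 'l'."""
    return LIGATURE_PATTERN.sub("f" + ZWNJ, text)


def strip_zwnj(text: str) -> str:
    """Remove every ZWNJ, e.g. before using pasted text as a command."""
    return text.replace(ZWNJ, "")


option = "--suffix"
broken = break_ligatures(option)

print(broken == option)              # the invisible ZWNJs change the string
print(len(option), len(broken))      # code point counts differ
print(strip_zwnj(broken) == option)  # stripping restores the original
```

The ZWNJ is invisible on screen but is an ordinary code point in the string, so anything that compares or parses the pasted text sees a different value unless it is stripped first.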

Finally, our standard ligatures are designed to be subtle, and should not cause confusion or distraction, even when the letters cross

They’re somewhat subtle, in visual output, yes.

My problems with this come from a more technical PoV; first the copy/paste issue I outlined above, and second… this needs a bit more explanation, I guess.

I use a “chrooted” conversion routine for sheet music. This generates PDFs, MIDI files, etc. from sources automatically, and then uses mutool draw to convert the PDF to plaintext and then looks for U+FFFD inside, which is inserted when none of the fonts available during the process (which is deliberately limited to only a handful, all of which are embeddable) has a glyph.

Unfortunately, it’s also inserted for the ligature glyphs, because they don’t have a UCD codepoint mapping (even those like f+f which could), leading to many false positives.
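The check described above can be sketched in Python as follows. This assumes the text has already been extracted from the PDF (for instance by `mutool draw`, as described) and decoded to a string; the function name `find_replacement_chars` is hypothetical, and the sketch does not invoke mutool itself.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def find_replacement_chars(text: str) -> list[tuple[int, int]]:
    """Return 1-based (line, column) pairs for every U+FFFD in text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch == REPLACEMENT:
                hits.append((line_no, col_no))
    return hits


# Made-up sample: the first U+FFFD stands in for an unmapped "ffi" ligature.
sample = "su\ufffdx\nplain line\n\ufffd"
print(find_replacement_chars(sample))
```

A scan like this cannot tell an unmapped ligature glyph from a genuinely missing glyph, because both arrive as U+FFFD. That is the false-positive problem described in this comment.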

Yeah, I guess this is a hard niche case, but for this specific use case I’d have preferred to just opt out. I have since switched to using GhostScript for plaintext conversion, which SEEMS to insert NUL for missing glyphs and the bare “subset glyph index”, such as \x04 for <0004> in the PDF, for ligatures (and has the nice property of leaving an actual � in the source alone, though these don’t show up in my sheet music) and have to hope this stays working.

I know that some German typographic traditions are strict about this,

Then you know more about those than I do… but this underlines (hah) my point?

Is this specifically for German? One way we could improve this is to create a German-language-only feature that disables the ligatures in

No, it’s not, and please don’t. They WILL cause problems, though, for copy/pasting e.g. command options from technical documentation (e.g. “--suffix” from the GNU tar(1) manpage), though controlling the text that’s copy/pasted from PDFs is, of course, a nightmare unto itself, especially in, say, pdflatex.

I could switch the font for the flag, of course, but too many font switches in a paragraph that explains things (as opposed to one that is mainly comprised of text to actually type) IMHO is less legible.

Here's an example of inserting a ZWNJ to break a ligature in Microsoft Word:

I don’t use any Microsoft software, though. I can just press `x200c↵ in my text editor to insert one, but, see above, I strongly believe that for most texts in an environment where you cannot control OpenType features (because the software is not meant to be a DTP program but just “has text”, like musical notation software) opt-out ligatures are wrong, which adds to the problem. And this is not just musical notation software but also tech docs, where 1:1 correspondence between input and output codepoints is often needed.

I think that in situations where I’d use a program sufficiently into DTP to have OpenType feature control, I’d have less of a problem with opt-out ligatures, and perhaps those can handle the “--suffix” case as well.

And is not the type tuner specifically to help people who use programs that are not DTP-ish? I even use them in php+libgd scripts that render words (such as page titles) as PNGs…

I’ve had sufficient trouble with the much-too-wide line spacing of the Gentium Plus font when deciding to finally move off Gentium 1, and the type tuner was helpful, but only for most cases. Remote sites, such as the online sheet music sharing service, would, of course, not have the typetuned variants. The software can control line spacing for lyrics only, not for text blocks, so I’ve got to substitute the old Gentium Basic (they also don’t have Gentium) for Gentium Plus with reduced line spacing before uploading, which totally breaks Cyrillic, of course…

bye, //mirabilos -- 22:20⎜ The crazy that persists in his craziness becomes a master 22:21⎜ And the distance between the craziness and geniality is only measured by the success 18:35⎜ "Psychotics are consistently inconsistent. The essence of sanity is to be inconsistently inconsistent

jvgaultney commented 1 year ago

I appreciate the trouble you're having.

The problem with the opt-out approach is that it requires explicit opting out by using characters that do show up when copy/pasting, whereas opting in will use codepoints that are specifically designed for that (though with the same argumentation, ZWJ in the middle most likely would also work).

Since in the great majority of situations people want ligatures, an opt-out approach is best. For copy/paste you actually want the ZWNJ included in the pasted text, so the application you're pasting into has a chance of resembling the copied source.

They WILL cause problems, though, for copy/pasting e.g. command options from technical documentation (e.g. “--suffix” from the GNU tar(1) manpage), though controlling the text that’s copy/pasted from PDFs is, of course, a nightmare unto itself, especially in, say, pdflatex.

Good PDF generators store both the glyph stream and the underlying character stream in the file, so it's available to PDF readers/interpreters. Then when you copy/paste you get the original chars, not a guess of what the glyphs might have represented. If it fails to do this it's a bug in the generator or reader, not the font.

I can just press `x200c↵ in my text editor to insert one, but, see above, I strongly believe that for most texts in an environment where you cannot control OpenType features (because the software is not meant to be a DTP program but just “has text”, like musical notation software) opt-out ligatures are wrong, which adds to the problem.

The ZWNJ is part of the text, and is intended to signify a specific sequence in which a ligature should never be allowed to form, which in the case of fi fl etc. is unusual.

And this is not just musical notation software but also tech docs, where 1:1 correspondence between input and output codepoints is often needed.

Unicode is a character encoding, not a glyph encoding, so you can never assume a 1:1 correspondence if you support Unicode. If you want to minimize situations where there might be a more complex correspondence then use a font that has no OpenType at all, or one specifically intended for that technical use (such as a monospaced terminal font).

And is not the type tuner specifically to help people who use programs that are not DTP-ish? I even use them in php+libgd scripts that render words (such as page titles) as PNGs…

ZWNJ should work fine for that.

I'm still not convinced that a TypeTuner Web option to disable automatic ligatures is needed. However I will add it to the list of requests for later review.

mirabilos commented 1 year ago

Victor Gaultney dixit:

For copy/paste you actually want the ZWNJ included in the pasted text, so the application you're pasting into has a chance of resembling the copied source.

No, definitely not, as “--suffix” is a syntax error!

Good PDF generators store both the glyph stream and the underlying character stream in the file, so it's available to PDF readers/interpreters. Then when you copy/paste you get the original chars, not a guess of what the glyphs might have represented. If it

I’ve seen this extremely rarely, unfortunately. MuseScore uses Qt and doesn’t do that, so presumably no Qt application does, and the less said about PDF copy/pasting for pdflatex output the better… (it even reuses parts, so for example…

something that’s always the same somethingdifferent
something that’s always the same etwasanderes

… pastes as…

something that’s always the same somethingdifferent
etwasanderes

… so… gah!)

The ZWNJ is part of the text, and is intended to signify a specific sequence in which a ligature should never be allowed to form, which in the case of fi fl etc. is unusual.

In the opt-out model, yes. But the opt-out model makes a presentation control character (ZWNJ) part of the character stream.

I’d much prefer to at least optionally have an opt-in model, where only actually requested ligatures are part of the character stream.

then use a font that has no OpenType at all, or one specifically intended for that technical use (such as a monospaced terminal font).

Yeah, though that looks weird when done too often within a paragraph, at least in some scenarios.

And is not the type tuner specifically to help people who use programs that are not DTP-ish? I even use them in php+libgd scripts that render words (such as page titles) as PNGs…

ZWNJ should work fine for that.

It does; this was more commenting on use of TTFs in general.

I'm still not convinced that a TypeTuner Web option to disable automatic ligatures is needed. However I will add it to the list of requests for later review.

OK, that’s all I can ask. Thank you, and thanks for also presenting the “other side”, with arguments I can understand.

As someone in IT, I’m very wary of most “automagisms” in general, hence a strong preference in general for opt-in over opt-out, even if I can see the benefit of opt-out. If it’s possible to have both, I’m a proponent of offering both so the user can decide.

Have a nice day, //mirabilos -- Sometimes they [people] care too much: pretty printers [and syntax highlighting, d.A.] mechanically produce pretty output that accentuates irrelevant detail in the program, which is as sensible as putting all the prepositions in English text in bold font. -- Rob Pike in "Notes on Programming in C"

mirabilos commented 11 months ago

One way to fix the most obvious of these for my use case would have been for the ligatures for ff, ffi, ffl, etc. to be assigned the corresponding codepoints, such as U+FB00, so that those would be output by mutool draw instead of U+FFFD. That would be a font issue then, of course.

In the meantime, I found that I can switch to gs -q -dSAFER -sDEVICE=txtwrite -o - "$name" which mostly reliably outputs NUL (\x00) only for missing glyphs (for other glyphs it outputs the UCS mapping where one exists, and the unmapped code from the embedded subsetted font, e.g. \x04, for glyphs without a UCS mapping), so I can scan for \x00 to detect missing glyphs instead. (It outputs U+002F for U-0001F12F so I’d be a bit wary of nōn-BMP glyphs ending in 00, but I don’t think I have these anywhere.)
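A rough Python sketch of that workaround follows. The Ghostscript arguments are the ones quoted in this comment; the function names are hypothetical, and the reading of NUL as "missing glyph" is the behaviour observed here ("mostly reliably"), not something this sketch verifies against Ghostscript documentation.

```python
import subprocess


def extract_text(pdf_path: str) -> bytes:
    """Run the Ghostscript command quoted above and return its raw stdout."""
    result = subprocess.run(
        ["gs", "-q", "-dSAFER", "-sDEVICE=txtwrite", "-o", "-", pdf_path],
        check=True,           # raise if gs exits with an error
        capture_output=True,  # keep stdout as bytes, undecoded
    )
    return result.stdout


def nul_offsets(data: bytes) -> list[int]:
    """Return the byte offset of every NUL (0x00) in data."""
    return [i for i, b in enumerate(data) if b == 0]


def has_missing_glyphs(pdf_path: str) -> bool:
    """Assumption from the comment: a NUL in the output marks a missing glyph."""
    return bool(nul_offsets(extract_text(pdf_path)))
```

Working on the raw bytes rather than decoded text keeps codes such as \x04 (ligature glyphs without a UCS mapping) distinct from \x00, so only the latter are reported.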

I think I’ll thus make this a “me problem” and retire this request. (By the way, the quality of the ffi, both ligature and not, in current Gentium Plus has amazed me today when I encountered it again.)