OpenType / opentype-layout

opentype-layout working group documents

Update OpenType to make AAT and Graphite completely obsolete #7

Open davelab6 opened 5 years ago

davelab6 commented 5 years ago

Apple AAT and SIL Graphite are two SFNT-based font shaping technologies that can do things that GPOS/GSUB cannot do.

Should a state-machine-driven lookup mechanism be added to OpenType, one that is super-fast and on par with what’s available in Graphite and Apple’s AAT, making them finally obsolete? :)

behdad commented 5 years ago

Hint hint: https://github.com/OpenType/opentype-layout/blob/master/proposals/complex_contextual.md

I was very excited when Martin and I came up with that. I promised him I would write the intro (aka rationale) to that proposal, which I never did. But that proposal includes the best of state machines and then some.

I'm still interested in pursuing that, if there's interest / commitment from MS.

Or just fold morx into GSUB and call it good.

be5invis commented 5 years ago

I heard @clerkma has a great format that covers all needs.

be5invis commented 5 years ago

Hmmm, I have started to write something interesting. Note: this (WIP) document is written in a very formal way, i.e., using small-step semantics to explicitly define the behavior of substitution.

transducing.pdf

tiroj commented 5 years ago

I understand and appreciate the need to either specify the lookup format as in the Complex Contextual Chaining Lookup type proposal or define the behaviour of a substitution in a formal way, but speaking as a type designer, I would find it really useful to have some examples of the kinds of things one could do with these approaches, and how one would apply them.

be5invis commented 5 years ago

@tiroj Or, could we simply adopt Graphite and make it official?

behdad commented 5 years ago

Or adopt AAT which will be universally supported in browsers later this year.

be5invis commented 5 years ago

@behdad To clarify, which "way" is your preference: extend OTL to give it the same abilities as AAT/Graphite, or completely throw away OTL and use a new shaping model? Option 1 is better for something like OpenType 1.9, but option 2 is a good idea for “OpenType 2.0”.

mhosken commented 5 years ago

For me it isn't the weaknesses of GSUB that make me cry, but the weaknesses of GPOS. And AAT doesn't help with that.

tiroj commented 5 years ago

Agreed: GSUB occasionally requires me to do some funky stuff to get glyphs in the right order for subsequent processing, but GPOS actually prevents me from doing things. AAT has been a non-starter with the majority of type designers for a quarter of a century, and without solving adjacency and positioning issues it's unlikely to gain any more support now.

Since we're throwing around ideas for alternative layout approaches, I'm going to mention DecoType's ACE, which demonstrably does solve the adjacency and positioning issues, even though currently very few people know how.

be5invis commented 5 years ago

@tiroj

GSUB occasionally requires me to do some funky stuff to get glyphs in the right order for subsequent processing, but GPOS actually prevents me from doing things.

Could you please elaborate? And can Graphite handle all the cases? If so, adopting Graphite would be a good option. I heard that Graphite can do very fine-tuned position adjustments, but I have not reviewed its semantics.

A note: I do not want the DWrite or HarfBuzz APIs to change after we change the shaping process, so some ideas, like mixing GPOS and GSUB, would carry a risk of API change. Also be careful about the cluster map, which is critical for editors.

tiroj commented 5 years ago

I'm not familiar enough with Graphite's positioning model to comment on that.

With regard to GPOS, there are problems with the interaction between spacing and mark positioning that affect some writing systems very badly, notably cascading Arabic styles such as nastaliq and diwani, and Telugu. I spoke about these issues at TypeCon a few years ago: Problems of Adjacency.

Recently, I've been working on Telugu again, and pushed things about as far as I could using massive numbers of contextual GPOS lookups to adjust spacing over lateral subscripts. It's ugly.

tiroj commented 5 years ago

I'd probably revise some parts of that 2014 presentation based on my more recent Telugu experience in which I decided to try to support Sanskrit as well as Telugu language text. My characterisation of Telugu shaping in OT on page 29 of the presentation PDF — 'it is all reasonably do-able' — now seems rather optimistic.

mhosken commented 5 years ago

I'll have a try at listing some of the key positioning capabilities of Graphite that enable us to produce fonts for different scripts, including Nastaliq.

Specifically to support Nastaliq we also added:

In answer to your questions, @be5invis: yes, Graphite handles the character-to-glyph and glyph-to-character mapping very well (better than OT), and there is code that can create OpenType clusters out of the results from Graphite, as evidenced in HarfBuzz.

Areas where OT does positioning stuff that Graphite doesn't do:

Graphite doesn't exist in opposition to OpenType, but as an alternative particularly for the most complex shaping and positional needs.

In my experience of working on complex script needs in OpenType, I have found that even with sophisticated compiler/macro layering on top of OpenType, it just cannot express what needs to be expressed with regard to positioning in some cases. So having another string to our bow would be nice.

While I'm sure you could come up with a use case that Graphite can't handle, I don't know of any and we have fonts in a wide range of scripts.

tiroj commented 5 years ago

re. nastaliq:

Shift collision avoidance for nuqtas and diacritics within a cluster

Do you have recent documentation on this? The initial version that I saw at TUC a few years back did a kind of approximation of nuqta and diacritic positioning which was definitely an improvement over trying to use GPOS, but didn't always correspond to script-normal positioning. As I recall, this was partly due to performance optimisation constraining the allowable angles of movement. Is this still the case, or is it possible to customise the collision avoidance algorithm if one wants to try to target traditional positioning norms?

I wrote up the algorithms in an unpublished paper that I would be happy to share. The main point about working in 45 degree space is one of speed given the workload is at least O(N^2) by dimension. Given all movement is in units of 45 degrees it makes sense to have outline approximations in the same space. This means that a single move is an optimal move against a 45 degree approximation to the outline. If doing multiple steps helps, that is still only an O(N) increase in cost so worth taking that route. The results we have seen are excellent generally.

Another way of saying this is that if you want to solve the problem exactly, it is going to be prohibitively expensive to calculate for minimal gain in quality. Yes, this is an approximation, but it's a good one that enables us to get a solution in a timely fashion. Happy to discuss further offline if that would help.
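To make the 45-degree idea concrete, here is a minimal sketch, not Graphite's actual code. It assumes each outline is approximated by its bounds on four axes (x, y, x + y, x - y), and that a colliding shape is moved once along whichever of the eight 45-degree directions gives the shortest separating shift. The names `Octabox`, `octabox` and `best_shift` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Interval = Tuple[float, float]  # (min, max)


@dataclass(frozen=True)
class Octabox:
    """Bounds of a shape on the axes x, y, s = x + y and d = x - y."""
    x: Interval
    y: Interval
    s: Interval
    d: Interval


def octabox(points: List[Tuple[float, float]]) -> Octabox:
    """Approximate a point set by an octagon with 45-degree edges."""
    xs = [px for px, _ in points]
    ys = [py for _, py in points]
    ss = [px + py for px, py in points]
    ds = [px - py for px, py in points]
    return Octabox((min(xs), max(xs)), (min(ys), max(ys)),
                   (min(ss), max(ss)), (min(ds), max(ds)))


# For each axis: the (dx, dy) move that raises that axis value by exactly 1.
AXES = [("x", (1.0, 0.0)), ("y", (0.0, 1.0)),
        ("s", (0.5, 0.5)), ("d", (0.5, -0.5))]


def best_shift(fixed: Octabox, moving: Octabox) -> Tuple[float, float]:
    """Shortest single move in a 45-degree direction that separates the two
    octagons; (0.0, 0.0) if they are already apart on some axis."""
    candidates = []
    for name, (ux, uy) in AXES:
        f_lo, f_hi = getattr(fixed, name)
        m_lo, m_hi = getattr(moving, name)
        up = f_hi - m_lo    # raise `moving` by this much to clear `fixed`
        down = m_hi - f_lo  # or lower it by this much
        if up <= 0 or down <= 0:
            return (0.0, 0.0)  # a separating axis already exists
        candidates.append((up * ux, up * uy))
        candidates.append((-down * ux, -down * uy))
    return min(candidates, key=lambda v: math.hypot(*v))


diamond = octabox([(0, 5), (5, 0), (10, 5), (5, 10)])
dot_far = octabox([(8, 8), (10, 8), (10, 10), (8, 10)])
dot_near = octabox([(7, 7), (9, 7), (9, 9), (7, 9)])
print(best_shift(diamond, dot_far), best_shift(diamond, dot_near))
```

In this toy case the plain bounding boxes of `diamond` and `dot_far` overlap, but the diagonal bounds show that they do not collide. Each pairwise check costs a constant number of comparisons, so the quadratic cost mentioned above comes from the number of pairs, not from the outline detail.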

be5invis commented 5 years ago

@tiroj @mhosken

DISCLAIMER: This is NOT a proposal from Microsoft, just from me.

My idea: if we want to refactor OTL, we can simplify the lookup mechanism into a flat list of rules:

matchState, (B, N, C), recognizer ▶ action

where B, N, C are natural numbers and N > 0, and recognizer is a function that takes (B+N+C) glyphs (in GSUB) or (B+N+C) glyph-position pairs (in GPOS) and returns either TRUE or FALSE.

When performing substitution or positioning, if the current state (an integer) is equal to matchState and the recognizer returns TRUE for the non-ignored glyphs, the action would be performed, to:

Recognizers and actions can have a huge space of flexibility and extensibility; they only need to conform to the “type”:
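One possible reading of the rule shape above, as a hedged GSUB-only sketch rather than the author's own definition (the action and type lists are not shown in this thread). It assumes that an action maps the N input glyphs to a replacement plus a next state, that rules are tried in list order, and that ignored glyphs are left out for brevity. `Rule`, `apply_rules` and the glyph names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Glyph = str  # hypothetical stand-in for a glyph ID


@dataclass
class Rule:
    match_state: int
    b: int  # backtrack glyph count
    n: int  # input glyph count, must be > 0
    c: int  # lookahead glyph count
    recognizer: Callable[[List[Glyph]], bool]
    # Assumed action shape: N input glyphs -> (replacement glyphs, next state).
    action: Callable[[List[Glyph]], Tuple[List[Glyph], int]]


def apply_rules(glyphs: List[Glyph], rules: List[Rule],
                start_state: int = 0) -> List[Glyph]:
    out = list(glyphs)
    state = start_state
    i = 0
    while i < len(out):
        for rule in rules:
            lo, hi = i - rule.b, i + rule.n + rule.c
            if rule.match_state != state or lo < 0 or hi > len(out):
                continue
            window = out[lo:hi]  # the B + N + C glyphs seen by the recognizer
            if rule.recognizer(window):
                inputs = window[rule.b:rule.b + rule.n]
                replacement, state = rule.action(inputs)
                out[i:i + rule.n] = replacement
                i += len(replacement)
                break
        else:
            i += 1  # no rule fired: advance, state unchanged
    return out


RULES = [
    # State 0: ligate f + i.
    Rule(0, 0, 2, 0, lambda w: w == ["f", "i"], lambda g: (["f_i"], 0)),
    # State 0: seeing "a" changes nothing visible but enters state 1.
    Rule(0, 0, 1, 0, lambda w: w == ["a"], lambda g: (g, 1)),
    # State 1: the next "b" gets an alternate, then return to state 0.
    Rule(1, 0, 1, 0, lambda w: w == ["b"], lambda g: (["b.alt"], 0)),
]

print(apply_rules(["a", "b", "f", "i", "b"], RULES))
```

Only the first "b" is substituted, because the state has returned to 0 by the time the second one is reached. A GPOS variant would pass glyph-position pairs to the recognizer instead.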

be5invis commented 5 years ago

@tiroj @mhosken Image of my concept: image

DISCLAIMER: This is NOT a proposal from Microsoft, just from me.

NorbertLindenberg commented 5 years ago

@be5invis Is there serious interest at Microsoft in improving OpenType layout in any substantial way? Are you speaking for Microsoft with your proposals and questions, or just for yourself? My impression was that Microsoft as a company considers OpenType layout done and has moved key people on to other projects. I’d be happy to hear that my impression was wrong.

be5invis commented 5 years ago

@NorbertLindenberg I am speaking for myself, not the company. However, the text people (like Andrew Glass; I am not sure whether he uses GitHub) are interested in improving OTL, but there are a lot of concerns, like API stability and performance. Also, they seldom express their ideas to the public.

behdad commented 5 years ago

@behdad To clarify, which "way" is your preference: extend OTL to give it the same abilities as AAT/Graphite, or completely throw away OTL and use a new shaping model? Option 1 is better for something like OpenType 1.9, but option 2 is a good idea for “OpenType 2.0”.

I don't support throwing away and restarting. There's no indication that we can do better, and it will just waste a lot :). I support either integrating AAT-compatible machines into GSUB, or adding something like the Complex Contextuals proposal that Martin and I produced, which is state-machine-equivalent but has several nice properties in terms of storage and integration with existing lookups. We will probably work on it again later this year.

https://github.com/OpenType/opentype-layout/blob/master/proposals/complex_contextual.md

behdad commented 5 years ago

Agreed: GSUB occasionally requires me to do some funky stuff to get glyphs in the right order for subsequent processing,

This attitude was what broke down advancement of layout when we tried back in 2015. Just because this is not the top difficulty, it doesn't mean it's not worth addressing.

mhosken commented 5 years ago

The point I was making is not that we shouldn't develop GSUB and make it better, but that just doing that is insufficient, on its own, to meet the needs out there.

aminanan commented 2 years ago

I was trying to implement a finite state machine lookup based on the Complex Contextual Chaining Lookup proposal. I've come to the following conclusion. Correct me if I'm wrong, please.

If we need to implement the full expressiveness of regular expressions with unbounded repetition (i.e., Kleene star), we should support submatch extraction. See, for example, the Tagged Deterministic Finite Automaton, an extension of the deterministic finite automaton capable of submatch extraction and parsing. (Consider a tag as an action or lookup to apply at a given position after a match.)

Without submatching, we will not be able to express a simple rule such as

@class1* @class1 ' lookup l1 @class1

which specifies that the lookup l1 be applied to the second-to-last glyph of a series of glyphs belonging to @class1. Graphite avoids the problem by not implementing Kleene star. The proposed complex lookup and AAT would support Kleene star only if it does not interfere with some action or lookup, such as

@class1* @class2 ' lookup l1
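To illustrate what submatch extraction buys for the first rule above, here is a small sketch. It assumes glyph classes are encoded as single characters ("a" for @class1) and uses a capture group in place of a tag. Python's `re` is a backtracking engine, not a TDFA, so this only demonstrates the behaviour, not the proposed implementation; `marked_position` is a hypothetical helper.

```python
import re
from typing import Optional

# "@class1* @class1' @class1" with @class1 written as "a".
# The capture group plays the role of the tag on the marked glyph.
RULE = re.compile(r"a*(a)a")


def marked_position(classes: str) -> Optional[int]:
    """Index of the glyph that lookup l1 would apply to, or None if the
    rule does not match at the start of the class string."""
    m = RULE.match(classes)
    return m.start(1) if m else None


for run in ["aaaa", "aaab", "aa", "a"]:
    print(run, marked_position(run))
```

A matcher that only reports accept or reject can say that "aaaa" matches, but not which glyph was marked. A machine reading left to right also cannot know at a given glyph that it is the second-to-last of the run until it has seen what follows, which is why the position has to be recovered after the match.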