Open moyogo opened 8 years ago
@benkiel Indeed, Glyphs.app and ufo2ft follow specific rules that get translated to AFDKO feature syntax, which then gets translated into GPOS lookups. Adobe’s markFeatureWriter.py follows similar rules but isn’t as complete. In short, an anchor will translate to a mark-to-mark lookup only if it is both a base anchor and a mark anchor in glyphs that also have mark anchors, and an anchor will translate to a mark-to-liga lookup only if there are matching numbered anchors in ligature glyphs. FontForge follows rules much closer to the model of GPOS lookups.
@typoman The AFDKO feature syntax is higher level than GPOS lookups so you won’t see exactly the same structure. In short the _top is only defined once in the AFDKO example you give, but it will be duplicated, one in the mark lookup and another one in the mkmk lookup in GPOS. So you could very well have "_top", "_topmkmk" (at the same coordinates), and "topmkmk" in circumflexcomb that will produce the same GPOS structure as having "_top" and "top".
Glyphs.app and ufo2ft make the anchor type implicit as long as you follow their rules. FontForge makes the anchor type explicit. The UFO spec doesn’t currently mention that there is a relation between "_top" and "top".
Sorry, I’m just babbling about how different authoring tools work without providing a way forward. But we may need to either specify rules in the spec that follow, or come close to, what Glyphs.app and ufo2ft do, or provide a way for the user to be more specific.
@gferreira’s skipExportAnchors as a list of anchors not to export would already help the user.
I think we should reconsider adding a lib to anchors after all, given all the possible uses for anchors listed in this thread alone.
One other thought: if we allowed a mark to have more than one type, it would accommodate both what Glyphs/UFO does and what FontForge does, allowing anchor re-use as well as more specificity.
Here I've gathered some information about how tools deal with anchors in UFO. I hope this could be useful for people who are trying to figure out how things already work in UFO tools. Some of this is just copy-paste from different places in GitHub.
FDK fea file syntax:
position cursive glyph.name <anchor x y> <anchor x y>;
The location of each anchor is written exactly as it is stored in the UFO, rounded to an integer. The first anchor is the entry and the second is the exit. If a glyph is missing one of these anchors, its location should be written as NULL: <anchor NULL>.
The anchor can be defined by writing #entry or #exit as the anchor name.
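The cursive rule described above can be sketched as a small helper. This is only an illustration of the syntax, not code from the FDK or ufo2ft; the function name `cursive_statement` and its parameters are made up for the example.

```python
def cursive_statement(glyph_name, entry=None, exit_=None):
    """Build one 'position cursive' statement (hypothetical helper).

    entry and exit_ are (x, y) tuples, or None when the glyph lacks that anchor.
    """
    def fmt(point):
        # A missing anchor is written as a NULL anchor.
        if point is None:
            return "<anchor NULL>"
        x, y = point
        # Coordinates are written as rounded integers.
        return f"<anchor {round(x)} {round(y)}>"

    # The entry anchor comes first, the exit anchor second.
    return f"position cursive {glyph_name} {fmt(entry)} {fmt(exit_)};"
```

For example, `cursive_statement("meem.medial", entry=(12.4, 0), exit_=(300.6, 10))` gives a statement with both anchors, while leaving out `exit_` writes `<anchor NULL>` in the second slot.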
Its mechanism is based on the Glyphs app and, in turn, older mark feature writers. In ufo2ft, if any of the supported features is already present in the feature file, it is not generated again. ufo2ft parses anchor.name using regular expressions and, according to the result, creates an object called NamedAnchor. This object is used to make all the mark-related features and it has three main variables:
isMark (bool variable): true for any anchor whose name starts with the mark prefix, which is typically a _. If the parent glyph has any isMark anchor then the glyph should be a mark type, otherwise it should be a base or ligature type. Examples:
  If anchor.name == _top -> isMark = True
  If anchor.name == top -> isMark = False
key (string variable): used to differentiate mark class types (e.g. top or bottom). Examples:
  If anchor.name == _top -> key = top
  If anchor.name == bottom_2 -> key = bottom
number (int variable): used for the logical order of marks in a ligature; it can also indicate that the parent glyph is a ligature type. Examples:
  If anchor.name == _top -> number = None
  If anchor.name == top_2 -> number = 2
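To make the three variables concrete, here is a rough sketch of that kind of name parsing. It is not the actual ufo2ft code: the regular expression, the `ParsedAnchor` type and `parse_anchor_name` are assumptions of mine that only reproduce the examples listed above.

```python
import re
from typing import NamedTuple, Optional

# Assumed naming convention: optional "_" mark prefix, a key,
# and an optional "_<number>" suffix for ligature components.
ANCHOR_NAME_RE = re.compile(r"^(?P<mark>_)?(?P<key>.+?)(?:_(?P<number>[0-9]+))?$")


class ParsedAnchor(NamedTuple):
    isMark: bool
    key: str
    number: Optional[int]


def parse_anchor_name(name: str) -> ParsedAnchor:
    """Split an anchor name into (isMark, key, number)."""
    match = ANCHOR_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"cannot parse anchor name: {name!r}")
    number = match.group("number")
    return ParsedAnchor(
        isMark=match.group("mark") is not None,
        key=match.group("key"),
        number=int(number) if number is not None else None,
    )
```

With this sketch, `_top` parses to a mark anchor with key `top` and no number, and `bottom_2` parses to a non-mark anchor with key `bottom` and number 2.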
Any glyph types (ligature, base, mark) should be defined in the GDEF part of the feature file in UFO. Otherwise, variables of NamedAnchor are used for its type definition.
ufo2ft creates the mark feature data while iterating over the NamedAnchor objects collected from the UFO glyph anchors:
If the NamedAnchor is not a mark, i.e. isMark == False (e.g. top, top_1):
  If the NamedAnchor has a number (e.g. top_1):
    Define MarkToLigature positioning inside the mark feature according to NamedAnchor.key and put the anchor in the order given by its number:
    position ligature glyph.name
        <anchor x y> mark @MC_top # number = 1
        ligComponent
        <anchor x y> mark @MC_top; # number = 2
    # MC in the class name stands for Mark Class
  If the NamedAnchor doesn't have a number (e.g. top):
    Define MarkToBase positioning inside the mark feature according to its key:
    position base glyph.name <anchor x y> mark @MC_top;
    If the parent glyph contains an isMark NamedAnchor (e.g. _top), define the current NamedAnchor (e.g. top) as MarkToMark positioning inside the mkmk feature:
    position mark glyph.name <anchor x y> mark @MC_top;
If the NamedAnchor is a mark (e.g. _top):
  Define a mark class using its key (e.g. top):
  markClass glyph.name <anchor x y> @MC_top;
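The decision tree above can be sketched for a single glyph as follows. This is a simplified illustration, not the ufo2ft implementation: `mark_statements` and `_parse` are hypothetical names, the statements are returned as one flat list rather than being sorted into the mark and mkmk features, and I am assuming that an unnumbered anchor on a mark glyph produces a MarkToMark statement instead of a MarkToBase one.

```python
import re

# Assumed name pattern: optional "_" prefix, key, optional "_<number>" suffix.
_NAME_RE = re.compile(r"^(_)?(.+?)(?:_([0-9]+))?$")


def _parse(name):
    mark, key, number = _NAME_RE.match(name).groups()
    return (mark is not None, key, int(number) if number is not None else None)


def mark_statements(glyph_name, anchors):
    """Return feature statements for one glyph.

    anchors is a list of (anchor_name, x, y) tuples.
    """
    parsed = [(*_parse(name), x, y) for name, x, y in anchors]
    # A glyph with at least one "_"-prefixed anchor is treated as a mark.
    glyph_is_mark = any(is_mark for is_mark, *_ in parsed)
    statements = []
    components = {}  # ligature component number -> list of anchor strings
    for is_mark, key, number, x, y in parsed:
        anchor = f"<anchor {x} {y}>"
        if is_mark:
            statements.append(f"markClass {glyph_name} {anchor} @MC_{key};")
        elif number is not None:
            components.setdefault(number, []).append(f"{anchor} mark @MC_{key}")
        elif glyph_is_mark:
            statements.append(f"position mark {glyph_name} {anchor} mark @MC_{key};")
        else:
            statements.append(f"position base {glyph_name} {anchor} mark @MC_{key};")
    if components:
        # Components are written in the order given by their numbers.
        parts = [" ".join(components[n]) for n in sorted(components)]
        statements.append(
            f"position ligature {glyph_name} " + " ligComponent ".join(parts) + ";"
        )
    return statements
```

Calling it with `("acutecomb", [("_top", 0, 500), ("top", 0, 700)])` yields one markClass statement and one `position mark` statement, which matches the mark-to-mark case discussed earlier in the thread.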
While defining mark features, the following is taken into account:
The top and bottom mark lookups are defined separately, and each lookup gets a MarkAttachmentType flag so that it ignores any other mark classes. For example, the lookup for MarkToMark positioning of bottom marks gets:
lookupflag MarkAttachmentType @MC_bottom;
Script (writing system) exceptions:
If a glyph is considered to belong to an Indic script (Beng, Cham, Deva, Gujr, Guru, Knda, Mlym, Orya, Taml, Telu), then its feature is not defined inside the mark or mkmk feature. Instead, according to the following criteria, it goes to the above mark feature abvm or the below mark feature blwm:
abvmAnchorNames = {"top", "topleft", "topright", "candra", "bindu", "candrabindu"}
blwmAnchorNames = {"bottom", "bottomleft", "bottomright", "nukta"}
Tibetan script (tibt) mark features do not go to the abvm and blwm features but to just one mkmk. Its lookup also doesn’t get lookupflag MarkAttachmentType, as that prohibits the attachment of marks to different anchors than the previous mark.
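As a rough illustration of the script exception, the routing of an anchor key to a feature tag could look like the sketch below. The script list and the two anchor-name sets are the ones quoted above; the function `feature_for_anchor`, its parameters, and the fallback to mark or mkmk for keys in neither set are my assumptions, and the Tibetan case gets no special handling here beyond not being in the Indic list.

```python
INDIC_SCRIPTS = {
    "Beng", "Cham", "Deva", "Gujr", "Guru",
    "Knda", "Mlym", "Orya", "Taml", "Telu",
}
ABVM_ANCHOR_NAMES = {"top", "topleft", "topright", "candra", "bindu", "candrabindu"}
BLWM_ANCHOR_NAMES = {"bottom", "bottomleft", "bottomright", "nukta"}


def feature_for_anchor(script, key, glyph_is_mark):
    """Pick the feature tag for an anchor key (hypothetical helper).

    script is the script code of the glyph, key is the anchor key
    (e.g. "top"), glyph_is_mark says whether the parent glyph is a mark.
    """
    if script in INDIC_SCRIPTS:
        if key in ABVM_ANCHOR_NAMES:
            return "abvm"
        if key in BLWM_ANCHOR_NAMES:
            return "blwm"
    # Assumed fallback: everything else stays in the regular features.
    return "mkmk" if glyph_is_mark else "mark"
```

So a nukta anchor on a Devanagari glyph would land in blwm, while the same top anchor on a Latin base glyph stays in mark.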
abvm (for Above Marks) or blwm (for Below Marks), and the Indian scripts option needs to be checked in the UI.
MarkAttachmentType lookupflag is no longer added to MarkToMark lookups meant for the abvm and blwm features (Indian scripts).
COMBINING_MARKS. I guess this is to avoid generating mark for glyphs which don't need to have mark positioning feature.
_aboveLC, _aboveUC and _aboveSC instead of just _above for all the three cases) so that the position of the anchors can be tested in FontLab. But this distinction is not necessary for the mark feature, and that is why this script allows for those casing tags to be trimmed, by setting the value of kDefaultTrimCasingTags.
LIGATURES_WITH_X_COMPONENTS, where X should be replaced by a number between 2 and 9 (inclusive). Additionally, the names of the anchors used on the ligature glyphs need to have a tag (e.g. 1ST, 2ND) which is used for corresponding the anchors with the correct ligature component.
I dropped some minor details. For more details read the source of markFeatureWriter in ufo2ft or WriteFeaturesMarkFDK in python module repo of the FDK.
Working out the purpose of an anchor takes a lot of guesswork on the anchor name and glyph data during the binary compile. Some features are generated only at compile time without being exposed to the user (composite anchor propagation). Writing OpenType features is the user’s job, not the compiler’s. An authoring tool could automate some of it (like propagating composite anchors), but there shouldn't be anything left for the compiler to guess. Also, for the sake of transparency, adding some attributes to the anchor can help remove the guesswork and give more control to the user. My suggestion is either to have an anchor lib or the following attributes:
Explicit anchor attributes to define its definition:
anchor.type string attribute as there can’t be two types per anchor (Entry, Exit, MarkToBase, MarkToMark, MarkToLigature, MarkClass).
anchor.index for the anchor in the MarkToLigature anchor type. One might say this could be interpreted from the anchor order inside the glyph, but it’s not easy to read it for humans. Also, it should be visible for the user that logical order in RTL scripts is starting from the right. Adding any numbers to the anchor.name could also add more to the guessing and is best to be avoided. In the Glyphs app, numbers could be added to anchor names because anchors with similar names are not allowed.
top, bottom) can be written in the anchor.name and there can be multiple anchors with the same name in the glyph but with different anchor.type.
Still, the compiler needs to guess how to write the flags, lookups and which feature the anchor definitions belong to. Again this can be automated by an authoring tool and saved in the UFO but it shouldn't be a compiler's guesswork. This could be achieved with data structures on the font level and anchor could reference that data. This could be one way of doing it:
anchor.flags or anchor.lookups list, so the tool explicitly writes the lookup flags instead of the compiler. The list items can be flags or the lookup name in string type. This attribute also helps to create separate lookups if needed. If anchor.lookups is added, then there should be another place to write the lookup information (e.g. font.lib) and there is no need for feature attribute since the lookup should have a list of features which it belongs to.
anchor.contextBefore and anchor.contextAfter list attributes that are used in Arabic and Indian scripts. The list items can be a glyph.name or a group name or list of glyph.name(s). FDK has a nonpublic script to write this based on the anchor name and it has limitations for writing the context.
In the end, maybe anchor.lib could be an easier solution instead of adding all these attributes. Since there is no anchor lib in UFO, personally I'm thinking that I will write the mark feature inside the data folder. I might have my own syntax for the feature file which is easier to read and diagnose but I haven't finished it yet.
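To picture the proposal, the suggested attributes could be modelled roughly like this. Nothing here exists in the UFO spec or in any tool: the `ExplicitAnchor` class and its validation rules are hypothetical, and only the attribute names and the list of type values are taken from the comment above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Type values as listed in the proposal.
ANCHOR_TYPES = {"Entry", "Exit", "MarkToBase", "MarkToMark", "MarkToLigature", "MarkClass"}


@dataclass
class ExplicitAnchor:
    """Hypothetical anchor record with explicit, non-guessed semantics."""
    name: str
    x: float
    y: float
    type: str
    index: Optional[int] = None                              # ligature component order
    lookups: List[str] = field(default_factory=list)         # lookup names or flags
    contextBefore: List[str] = field(default_factory=list)   # glyph or group names
    contextAfter: List[str] = field(default_factory=list)

    def __post_init__(self):
        if self.type not in ANCHOR_TYPES:
            raise ValueError(f"unknown anchor type: {self.type!r}")
        # Assumed rule: a ligature anchor must say which component it belongs to.
        if self.type == "MarkToLigature" and self.index is None:
            raise ValueError("MarkToLigature anchors need an index")
```

With a structure like this, a glyph could carry two anchors both named `top`, one of type `MarkToBase` and one of type `MarkToMark`, without encoding anything in the name.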
I think Glyphs.app uses a naming scheme … For ligature anchors it would use _name_1, _name_2 etc, afaik, may be not fully correct though.
The ligature anchors are without underscore prefix.
Would defining glyph.lib["public.*"] keys for glyph type and ligature component count be helpful?
Glyphs.app defines several attributes for each glyph: script, category (letter, mark), subCategory (ligature, nonspacing), and decomposition (a list of glyphInfo objects, each of which has all the above info). The ligature component count can be computed from the decomposition info (iterate over it and count the number of non-mark glyphs).
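A minimal sketch of that count, assuming the decomposition is available as a list of dicts with a "category" entry (this dict shape is an assumption for illustration, not the Glyphs.app API):

```python
def ligature_component_count(decomposition):
    """Count the non-mark glyphs in a decomposition list.

    decomposition is assumed to be a list of dicts such as
    {"name": "f", "category": "Letter"}.
    """
    return sum(
        1 for info in decomposition
        if info.get("category", "").lower() != "mark"
    )
```

For a decomposition of f, i and acutecomb, where only acutecomb has the mark category, this counts two components.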
As explained in the issue you reference, adding lib to anchor is problematic, so please focus on solutions that don't require that.
In the issue it states that it is complicated and that .ufo doesn't support it. Both problems can be solved.
Regarding anchor.type: what are the needed values for such a field? In the above comments I read: base, mark, ligature, entry, exit
Sometimes anchors are only used to position components. That can be true for all five types. So a flag is needed that says: don't consider this anchor when generating GPOS.
For entry and exit a prefix is needed to differentiate it from the mark anchor. The prefix for mark anchor is _, the prefix for cursive anchor is # so it becomes #entry or #exit.
Entry/exit anchors don’t need differentiation. They have unique names. If you add any suffix (a '#' or an emoji), the anchors are used to (cursively) position components but are ignored when generating GPOS (see above). I see how it could be possible to add options to have multiple cursive lookups. But maybe I’ll wait to see what is decided on this topic here.
About the "Anchor Definition" and "Lookup definition": this is a very detailed and good description of what is needed. But it might be too complex for most people. We need to find a good balance.
The anchor.type should allow custom types, e.g. in Glyphs.app you can add an 'LSB' anchor to define alternate metrics for the palt feature (used in CJK fonts).
My understanding of the current state is that we're going to add a .lib to the anchor; with that, any type can be stored.
Good. The comment was just for the .type case.
@schriftgestalt I think @justvanrossum has folded on this, .lib seems to be current consensus. We'll want to register some standard keys for anchor for common uses, of course.
I understand. I had written that just before the meeting and posted it because I had written it.
Sorry if I miss or repeat something, just my 2 cents on @khaledhosny's comment:
One thing that would still be unsolved is contextual mark positioning. I have no idea how that would be supported for anchors without full OpenType machinery in the format. So I guess people will have to keep writing that manually or have font-specific scripts to handle them.
It would be useful to have anchors accessible by name in the features.fea file via anchor format E, so that glif anchors automatically create anchorDefs; that would simplify contextual rules.
position cursive meem.medial <anchor entry_default> <anchor exit_default>;
position cursive @BACK_COND meem.medial' <anchor entry_special> <anchor exit_special> @AHEAD_COND;
Update: anchorDef is global, not per-character, so this solution is unsustainable.
For ligature anchors, many UFOs designed with one authoring tool don’t work with other authoring tools as they have different ways of storing this information. Some authoring tools expect specific suffixes (like _1, _2, ... or #1, #2, ...) while others expect specific prefixes. It would be better to standardize this, either in the name or preferably with an attribute (for example ligatureIndex).
/cc @graphicore @khaledhosny @jamesgk