Better anchor definition

moyogo commented 8 years ago

For ligature anchors, many UFOs designed with one authoring tool don’t work with other authoring tools as they have different ways of storing this information. Some authoring tools expect specific suffixes (like _1, _2,... or #1, #2,...) while others expect specific prefixes. It would be better to standardize this, either in the name or preferably with an attribute (for example ligatureIndex).

/cc @graphicore @khaledhosny @jamesgk

graphicore commented 8 years ago

Related questions:

How do we know, looking at a UFO source, which glyph is of type Ligature, Mark, Base, other ... ?
How many parts is a Ligature-Glyph made of? We need that to write correct mark2liga features, also for ligature carets (also needs position information). IMHO, there's nothing in UFO that helps.

moyogo commented 8 years ago

@graphicore I think another issue can be opened for a way to define ligature caret. The point of this issue is ligature anchors.

khaledhosny commented 8 years ago

There are also cursive anchors, but in general how do you tell what type a given anchor is: base, mark, ligature, entry, or exit? I don’t see how one would be able to use any of the anchors without knowing such essential information about them.

graphicore commented 8 years ago

@moyogo the point is that you need the type of the glyph -- ligature -- and how many components it has, for both.

@khaledhosny right. I think Glyphs uses a naming scheme where name is the base anchor and _name is the matching mark anchor. For ligature anchors it would use _name_1, _name_2 etc., afaik, though that may not be fully correct. For entry and exit it probably just uses reserved names like "entry" and "exit". I'm not saying it's the best solution, but naming conventions could be a solution.

typesupply commented 8 years ago

I've done a lot of work with using glyph names to indicate feature states and behavior. Pretty much every fancy OpenType font that I've written code for has had a specific naming scheme and a script that interprets it into .fea. The names are ultimately not very flexible and they are really cumbersome for designers. It's not fun to type /o.010340/o.120140 when you just want to compare a couple of "o" designs.

Anyway, I think a much more future proof solution would be to publicly define a structure that can be stored in the lib that defines GSUB/GPOS behavior for a particular glyph. That would open up a huge number of interesting possibilities. But... How far does it go? Do we define what the generated .fea should look like? Or, do we stop at saying, "this is a ligature and here's some data about it" and "this is a mark"? This is going to get really deep, really fast. Even more problematic, how is a tool supposed to know when something that exists in .fea should be replaced with auto-generated .fea code and when it should be left alone because the designer made an edit?

Don't get me wrong, I think some sort of data structure that can deeply describe the intended use of a glyph would be really useful. Defining how that is used is something that I'm not sure we can get consensus on.

justvanrossum commented 5 years ago

Would Anchor become more usable if it had an optional type attribute, that contains a (standardized) string?
Would defining glyph.lib["public.*"] keys for glyph type and ligature component count be helpful?

typoman commented 5 years ago

That would be very useful. As I mentioned in a fontParts issue, having an anchor lib would also be nice for adding other information, like the context of mark positioning. This context information can be stored in the glyph lib now, but when an anchor is removed, every authoring tool would also have to remove that information from the glyph lib. If there is going to be an anchor type property, I would prefer to store the information there instead of in the glyph lib. I guess I could store a more elaborate data structure there using XML if it's a string, but I'm not sure whether that would be considered standardized.

justvanrossum commented 5 years ago

As explained in the issue you reference, adding lib to anchor is problematic, so please focus on solutions that don't require that.

Regarding anchor.type: what are the needed values for such a field? In the above comments I read:

base
mark
ligature
entry
exit

Is that correct and complete?

Likewise for the type of a glyph?

typoman commented 5 years ago

> As explained in the issue you reference, adding lib to anchor is problematic, so please focus on solutions that don't require that.

I will probably use this type property then.

> Is that correct and complete?

I think so, but I don't know what the use of the ligature type is. In Arabic, for example, an anchor is either base or mark. On ligatures, the base anchor gets an index for the order.

typoman commented 5 years ago

> Likewise for the type of a glyph?

Only from this set:

base
ligature
mark
component

https://docs.microsoft.com/en-us/typography/opentype/spec/gdef#glyph-class-definition-table-overview
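
To make the glyph type list concrete, here is a minimal Python sketch that groups glyphs by type and writes a GDEF `GlyphClassDef` statement in feature-file syntax. The class numbers (1 to 4) come from the OpenType GDEF specification linked above. The function name `glyph_class_def` and the input dictionary are made up for illustration, since UFO does not currently define where a glyph's type would be stored. Leaving a slot empty for a class with no glyphs is my understanding of what the feature-file syntax allows, so treat that detail as an assumption.

```python
# GDEF glyph class values as defined by the OpenType specification.
GDEF_CLASSES = {"base": 1, "ligature": 2, "mark": 3, "component": 4}


def glyph_class_def(glyph_types):
    """Build a GDEF table block from a {glyph name: type} mapping.

    The mapping is a hypothetical input; nothing in UFO provides it yet.
    """
    groups = {glyph_type: [] for glyph_type in GDEF_CLASSES}
    for glyph_name, glyph_type in sorted(glyph_types.items()):
        groups[glyph_type].append(glyph_name)  # KeyError on an unknown type

    def fea_class(names):
        # An empty slot is written as nothing at all (assumed to be valid).
        return "[" + " ".join(names) + "]" if names else ""

    classes = ", ".join(fea_class(groups[t]) for t in GDEF_CLASSES)
    return "table GDEF {\n    GlyphClassDef " + classes + ";\n} GDEF;"


print(glyph_class_def({"a": "base", "f_i": "ligature", "acutecomb": "mark"}))
```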

typoman commented 5 years ago

(Just to be clear for anchor type) Anchor type can only be:

base
mark
entry
exit

A base anchor can get an index attribute for the logical order of the mark in a ligature.

typoman commented 5 years ago

Actually, in an Adobe .fea file, mark positioning on a ligature and on a base are defined differently.

Base: pos base [behDotless-ar] <anchor 428 -5> mark @mark_bottom <anchor 438 368> mark @mark_top;
Ligature: pos ligature [lam_alefWasla-ar.fina] <anchor 473 -3> mark @mark_bottom <anchor 492 726> mark @mark_top ligComponent <anchor 139 -3> mark @mark_bottom <anchor 173 893> mark @mark_top;

But personally I don't need to know if an anchor is ligature type when I generate the feature. I just check if it's a base and if it has an index.
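
The "check if it's a base and if it has an index" approach could be sketched roughly as follows. This is only an illustration: the function `mark_pos_rule` and the tuple layout `(name, x, y, index)` are assumptions made for this example, not part of any existing tool. It emits a `pos base` rule when no anchor carries an index and a `pos ligature` rule otherwise, reproducing the two statements quoted above.

```python
from collections import defaultdict


def mark_pos_rule(glyph_name, anchors):
    """Build a mark-to-base or mark-to-ligature rule for one glyph.

    anchors: list of (name, x, y, index) tuples for base anchors only.
    index is None for a plain base anchor, or a 1-based component number.
    """
    def attachment(name, x, y):
        return f"<anchor {x} {y}> mark @mark_{name}"

    if all(index is None for _, _, _, index in anchors):
        parts = [attachment(name, x, y) for name, x, y, _ in anchors]
        return f"pos base [{glyph_name}] " + " ".join(parts) + ";"

    # Group anchors per ligature component; unindexed anchors go to component 1.
    components = defaultdict(list)
    for name, x, y, index in anchors:
        components[index or 1].append(attachment(name, x, y))
    body = " ligComponent ".join(
        " ".join(components[i]) for i in sorted(components)
    )
    return f"pos ligature [{glyph_name}] {body};"


print(mark_pos_rule("behDotless-ar", [("bottom", 428, -5, None), ("top", 438, 368, None)]))
```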

typoman commented 5 years ago

The current solution, which avoids adding any property to the anchor, is to use naming conventions for anchors. The advantage of a naming convention is that the intention of the anchor is visible. The name corresponds to the mark anchor that is going to be placed on the base. So if a top mark is going to be placed on a base, the base anchor in the base glyph is called top and the mark anchor in the mark glyph is called _top. This is the naming convention from the Glyphs app, and I think it's also used in fontmake. For a ligature, the anchor index is written after the name, so it becomes top_1. For entry and exit, a prefix is needed to differentiate them from the mark anchor. The prefix for a mark anchor is _ and the prefix for a cursive anchor is #, so it becomes #entry or #exit.
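
As a rough illustration of the convention exactly as described in this comment (it has not been checked against what Glyphs or fontmake actually do), a name parser could look like the sketch below. `parse_anchor_name` and `ParsedAnchor` are hypothetical names invented for this example.

```python
import re
from typing import NamedTuple, Optional


class ParsedAnchor(NamedTuple):
    kind: str             # "base", "mark", "entry" or "exit"
    name: str             # attachment class, e.g. "top"
    index: Optional[int]  # ligature component index, if any


def parse_anchor_name(anchor_name: str) -> ParsedAnchor:
    """Interpret an anchor name using the convention described above."""
    if anchor_name in ("#entry", "#exit"):
        return ParsedAnchor(anchor_name[1:], anchor_name[1:], None)
    if anchor_name.startswith("_"):
        return ParsedAnchor("mark", anchor_name[1:], None)
    # A trailing _<number> marks a base anchor on a ligature component.
    match = re.fullmatch(r"(.+)_(\d+)", anchor_name)
    if match:
        return ParsedAnchor("base", match.group(1), int(match.group(2)))
    return ParsedAnchor("base", anchor_name, None)


for example in ("top", "_top", "top_1", "#entry"):
    print(example, parse_anchor_name(example))
```

Note how much meaning is packed into a single string here: the type, the attachment class, and the index all have to be recovered by pattern matching.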

justvanrossum commented 5 years ago

Suggestion: Describe/document the Glyphs/fontmake behavior in more detail. It may then become easier to reason about the following:

What are the weaknesses of this scheme?
How could the procedure (of generating such features) benefit from UFO enhancements?

In other words:

What is the problem?
How does the current solution work?
How is the current solution not adequate?
Can the UFO format be enhanced to solve said inadequacies? If so, how?

typoman commented 5 years ago

> Suggestion: Describe/document the Glyphs/fontmake behavior in more detail. It may then become easier to reason about the following:

I will do that, thank you for pointing them out.

One note for now. If the glyph type is mark, it could include both mark and base anchors. In a glyph that is defined as a mark, the base anchor is used for mark to mark positioning. Takeaway: anchor type cannot always be inferred from its glyph type.

khaledhosny commented 5 years ago

One limitation of the above scheme is that it allows only one cursive anchor/lookup in the entire font.

typoman commented 5 years ago

Could you give an example where there is a need for multiple lookups for cursive anchor?

khaledhosny commented 5 years ago

OpenType allows it, and I don’t see why we should have such a limitation otherwise. I have a font that has cursive anchors with the RTL flag and without it, based on whether it wants the rightmost or the leftmost glyph to be the one sitting on the baseline.
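
For readers unfamiliar with this case, the sketch below shows what two separate cursive lookups, one with the `RightToLeft` lookup flag and one without, could look like when generated as feature-file text. The function name `cursive_lookup`, the lookup names, the glyph names, and the coordinates are all invented for illustration.

```python
def cursive_lookup(lookup_name, glyph_anchors, right_to_left=False):
    """Build one GPOS cursive attachment lookup in feature-file syntax.

    glyph_anchors maps glyph name -> (entry, exit); each is an (x, y)
    tuple, or None when the glyph lacks that anchor.
    """
    def fmt(point):
        if point is None:
            return "<anchor NULL>"
        return f"<anchor {point[0]} {point[1]}>"

    lines = [f"lookup {lookup_name} {{"]
    if right_to_left:
        lines.append("    lookupflag RightToLeft;")
    for glyph_name, (entry, exit_) in sorted(glyph_anchors.items()):
        lines.append(f"    pos cursive {glyph_name} {fmt(entry)} {fmt(exit_)};")
    lines.append(f"}} {lookup_name};")
    return "\n".join(lines)


# Two independent lookups in the same font, which a single reserved
# entry/exit anchor pair per glyph cannot express.
print(cursive_lookup("curs_rtl", {"beh-ar.init": (None, (0, 50))}, right_to_left=True))
print(cursive_lookup("curs_plain", {"beh-ar.fina": ((300, 0), None)}))
```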

typoman commented 5 years ago

> I have a font that has cursive anchors with the RTL flag and without it, based on whether it wants the rightmost or the leftmost glyph to be the one sitting on the baseline.

Good point. Thank you!

typoman commented 5 years ago

This also raises the question: do we need an RTL property for anchors?

justvanrossum commented 5 years ago

Is this https://github.com/googlefonts/ufo2ft/issues/303 ?

@khaledhosny, are there any other fundamental problems that are good to consider up front?

khaledhosny commented 5 years ago

No, another font that I didn’t convert to UFO since fontmake does not support cursive anchors at all.

My main problem with UFO anchors right now is that their behavior is unspecified. Tools like fontmake and ufo2ft try to follow Glyphs, but 1) Glyphs is not a UFO editor, 2) its behavior is not specified either, and 3) it has arbitrary limitations (like supporting only one cursive anchor per font).

I prefer explicit over implicit, so I think type, index and flags attributes would (potentially) allow the compiler to build any kind of mark positioning lookup supported by OpenType without having to do lots of guesswork. The new attributes should be optional so that people who prefer the current behavior can continue doing so without disruption.
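
To picture what such optional attributes might look like, here is a small Python sketch that serializes an anchor to an XML element. The `x`, `y` and `name` attributes exist on the GLIF anchor element today; `type`, `index` and `flags` are the hypothetical additions discussed in this thread and are not part of any UFO specification. The `Anchor` class is likewise made up for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class Anchor:
    x: float
    y: float
    name: Optional[str] = None
    # Hypothetical attributes proposed in this thread; all optional.
    type: Optional[str] = None
    index: Optional[int] = None
    flags: Optional[str] = None

    def to_element(self) -> ET.Element:
        element = ET.Element("anchor", x=str(self.x), y=str(self.y))
        for attr in ("name", "type", "index", "flags"):
            value = getattr(self, attr)
            if value is not None:  # omitted attributes are simply not written
                element.set(attr, str(value))
        return element


explicit = Anchor(473, -3, name="bottom", type="base", index=1)
legacy = Anchor(438, 368, name="top")  # keeps working without new attributes
print(ET.tostring(explicit.to_element(), encoding="unicode"))
print(ET.tostring(legacy.to_element(), encoding="unicode"))
```

Because every new attribute defaults to absent, an anchor written by a tool that only knows the current format serializes exactly as before.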

One thing that would still be unsolved is contextual mark positioning. I have no idea how that would be supported for anchors without full OpenType machinery in the format. So I guess people will have to keep writing that manually or have font-specific scripts to handle them.

typoman commented 5 years ago

> One thing that would still be unsolved is contextual mark positioning. I have no idea how that would be supported for anchors without full OpenType machinery in the format. So I guess people will have to keep writing that manually or have font-specific scripts to handle them.

This is the main limitation of anchors for me in UFO. A context cannot be written in the name. I would love to have the option to write it in the anchor, but there is none atm.

typoman commented 5 years ago

A possible solution for the future: adding an attributes dictionary to the anchor with (standardized) string keys?

typoman commented 5 years ago

A solution for now: write all the related info for mark and cursive positioning inside the glyph lib and define the spec. An authoring tool (e.g. an RF extension) is required to show the preview, change its attributes, and compile it to features.
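
A minimal sketch of what such a glyph lib entry could look like, using only the Python standard library. The key `com.example.markPositioning`, the helper names, and the attribute names are all invented for illustration; no such structure has been specified. The second helper shows the bookkeeping burden mentioned earlier in the thread: when an anchor is deleted, the tool has to clean up the matching lib data itself.

```python
import plistlib

# Hypothetical reverse-domain key; not defined by any spec.
LIB_KEY = "com.example.markPositioning"


def set_mark_data(glyph_lib, anchor_name, **attributes):
    """Attach positioning attributes to an anchor, keyed by anchor name."""
    glyph_lib.setdefault(LIB_KEY, {})[anchor_name] = attributes


def remove_anchor_data(glyph_lib, anchor_name):
    """Drop the lib data for a deleted anchor so it does not go stale."""
    entries = glyph_lib.get(LIB_KEY, {})
    entries.pop(anchor_name, None)
    if not entries:
        glyph_lib.pop(LIB_KEY, None)


glyph_lib = {}
set_mark_data(glyph_lib, "top", type="base", index=1, flags=["RightToLeft"])
# The glyph lib is a property list, so the structure must be plist-serializable.
print(plistlib.dumps(glyph_lib).decode("utf-8"))
```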

typoman commented 5 years ago

Another disadvantage of the anchor naming scheme: the intention of the designer is unclear. The naming scheme used for generating mark positioning is exactly the same as the one used to build precomposed accented glyphs. Is the anchor made to generate mark positioning or just to build precomposed accented glyphs? This could lead to unnecessary extra data in the final binary. An anchor that is used for glyph construction is not necessarily meant to be used for mark positioning, and the position of a precomposed mark can differ from that of a mark placed on the actual base glyph by mark positioning (very common in Arabic). One solution: a new data structure only for generating mark and cursive attachment, which could be saved in the glyph lib and would also keep these data separate.

justvanrossum commented 5 years ago

> Is the anchor made to generate mark positioning or just to build precomposed accented glyphs?

That's an interesting question.

On the one hand: well, maybe it is cool to have mark features for (Latin) accents? On the other: that seems rather accidental, and is not controllable enough.

But anchor.type could be used to distinguish them, if it existed, so this is a nice argument in favor of that.

typoman commented 5 years ago

> On the one hand: well, maybe it is cool to have mark features for (Latin) accents?

I'm also wondering. In Arabic, mark positioning is needed on any type of letter (whether precomposed or not) but in Latin:

How often are letter+accent(s) combinations typed?
Which layout engines override these combinations and replace them with a precomposed glyph that already exists in the font?

Glyphs app uses the anchor naming scheme to generate mark positioning and also to build composites. There are situations where a composite glyph doesn't have any anchors, but Glyphs app borrows mark positioning from the baseGlyph(s) and duplicates it in the composite during compile. The user can't even see these virtual mark positionings in the glyph view. The generated mark feature for Latin could become as large as, or even larger than, the one for Arabic. I wonder if this much data could cause other issues? I would appreciate @behdad's and/or @anthrotype's insight into these questions.

typoman commented 5 years ago

In my experience, mark positioning is not equal to accented composites.

typoman commented 5 years ago

Since I don't know much about Devanagari, I would also appreciate @tiroj's insights. That would help to address any current issues with anchor definition or mark positioning in UFO.

benkiel commented 5 years ago

Speaking of how I use anchors: I use the same anchor for precomposed positioning and to generate a mark feature (dot accents and the like; we include a feature for things that aren't precomposed, with a font having both spaced and zero-width accents in the character set). That said, I can see where you wouldn't want the mark feature anchor to be the anchor for precomposed characters. I think with an anchor.type, there's enough flexibility to have both.

typoman commented 5 years ago

Thank you, Ben. I understand that anchor.type will address that issue and I think I will use it too. The reason I would personally still opt for a glyph lib solution is:

UFO implementation is not fast. We don't know when UFO 4 (if there is going to be one) will make it to RF, for example. If there's going to be a UFO 3.1 or something similar that will happen soon, I will consider adapting my tools to this new attribute.
I would still need more attributes to store more complex data structures, most notably for context and flags. One attribute is not enough for my needs. I need a very explicit data structure without guesswork, as @khaledhosny said. Glyph lib gives accessible data storage out of the box.
I can predict I would have multiple anchors lying on top of one another or very close to each other, because of small differences between mark anchors and precomposed glyph anchors. If I go for the glyph lib solution, I can hide my mark anchors from the regular user. They can choose a tool that shows only my mark anchors, and in that mode the precomposed glyph anchors are hidden. I would have more control over hiding these anchors to keep things from getting messy.

benkiel commented 5 years ago

@typoman I understand why you would prefer to use your own solution in the glyph's lib. That's why it is there, and no one is taking that away or saying it's a bad use of the glyph lib.

We are, however, trying to sort out if an anchor.type would be useful, and if so what it would need. Are you saying that you don't think anchor.type is needed/useful? Or are you rather saying that no matter what, you'd use glyph.lib? If the latter, we understand, but it's steering the conversation away from the merits and needs of an anchor.type.

typoman commented 5 years ago

Sorry, I was only trying to share my solution with others who have issues similar to mine, in case only one attribute is added. If you think there are advantages to adding more attributes (like context, flag, ligatureIndex), I would use anchor.type along with them because it's easier to keep all the data in one place. Whether or not I use anchor.type, I think there are already more advantages to it than just what I described.

typoman commented 5 years ago

~~One another advantage to anchor.type could be storing hinting information too?~~

tiroj commented 5 years ago

> Since I don't know much about Devanagari, I would also appreciate @tiroj's insights. That would help to address any current issues with anchor definition or mark positioning in UFO.

I don't know much about anchor definition or mark positioning in UFO, so am unsure what the issues are. I can confirm that the variety of anchors used in Indic and SE Asian fonts tend to be distinct from any that might be used to build composites, but in general these scripts don't involve many composites anyway.

Contextual anchor attachment is very important for these scripts though (as it can be in some styles of Arabic). A good Devanagari example would be re-positioning of candrabindu, anusvara, and/or reph on bases when the base is preceded by an ikar vowel sign. Depending on the individual base width, the ikar variant width, and the standard anchor position of the mark on the base, the mark may need to shift to a different anchor in context of the ikar. I've not seen good visual handling of this in any tools. But that's a tool issue rather than a data format issue, I think?

typoman commented 5 years ago

> I've not seen good visual handling of this in any tools. But that's a tool issue rather than a data format issue, I think?

If you're talking about VOLT, I think that's the best visualization tool for contextual positioning so far. I also think VOLT doesn't make it easy. Contextual mark positioning aside, I still prefer the anchor naming mechanism in UFO because it's far more convenient to control than VOLT. But the lack of context in mark positioning tools is preventing designers from creating more typographic diversity in complex scripts like Arabic (and probably Devanagari too).

behdad commented 5 years ago

lol. @typoman hijacked another thread. :)

typoman commented 5 years ago

I've been instructed not to post a comment on everything I find relevant to the subject. I will do it in one big post once I have gathered all I can. Thank you all for your patience, and sorry for the distractions!

gferreira commented 5 years ago

> Would Anchor become more usable if it had an optional type attribute, that contains a (standardized) string?

yes!

> Regarding anchor.type: what are the needed values for such a field? base, mark, ligature, entry, exit

I can think of an additional category of anchors which are used during the design stage and are not needed in the generated font. examples:

anchors used for positioning ‘meta glyph parts’ (ex: making an ‘m’ out of a stem and two arches)
anchors used only for creating precomposed accented glyphs (and not used by any feature)
anchors which work as ‘markers’ for parametric shapes (think of variable bitmap fonts)
anchors to control dimensions beyond 2D space (HOI handles, warp mask, crazy things)

exploratory questions:

could arbitrary anchor types be allowed?
would it make sense to have something like skipExportAnchors somewhere?
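
one way to picture both questions together is the small Python sketch below. it assumes a hypothetical `type` field on each anchor (here a plain dict), allows arbitrary type strings, and only passes the five types listed earlier in the thread on to feature generation. anchors without a type are kept, so files that rely on the current naming behavior are not affected. the function name and the design-time type names are made up.

```python
# The five types proposed earlier in the thread; anything else is design-only.
EXPORTED_TYPES = {"base", "mark", "ligature", "entry", "exit"}


def anchors_for_export(anchors):
    """Return the anchors a compiler should consider for GPOS features.

    Each anchor is a dict with at least "name"; "type" is optional and
    hypothetical. Untyped anchors are kept for backward compatibility.
    """
    exported = []
    for anchor in anchors:
        anchor_type = anchor.get("type")
        if anchor_type is None or anchor_type in EXPORTED_TYPES:
            exported.append(anchor)
    return exported


anchors = [
    {"name": "top", "type": "base"},
    {"name": "arch_1", "type": "construction"},  # arbitrary design-time type
    {"name": "bottom"},                          # legacy anchor, no type
]
print([a["name"] for a in anchors_for_export(anchors)])
```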

moyogo commented 5 years ago

While FontForge sticks to one anchor type per anchor as its model is close to the OT model, Glyphs.app and fontmake/ufo2ft currently may assign more than one anchor type per anchor.

For example when circumflexcomb has only a mark-to-base mark anchor, FontForge will have just that. Glyphs.app and fontmake/ufo2ft will be happy with a _top anchor. When that circumflexcomb has a mark-to-base mark anchor, a mark-to-mark base anchor and a mark-to-mark mark anchor, FontForge will have all 3. Glyphs.app and fontmake/ufo2ft will be happy with only 2 anchors, _top for the mark-to-base mark anchor and the mark-to-mark mark anchor and top for the mark-to-mark base anchor.
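
The mapping described in this example can be written down as a short Python sketch. It only restates the behavior described above for Glyphs.app and fontmake/ufo2ft and has not been checked against either tool; the function name `anchor_roles` and the role strings are made up for illustration.

```python
def anchor_roles(anchor_name, glyph_is_mark):
    """List the OpenType roles one name-based anchor can stand for."""
    if anchor_name.startswith("_"):
        # One "_top" anchor serves as the mark anchor in both lookups.
        return ["mark-to-base mark", "mark-to-mark mark"]
    if glyph_is_mark:
        # A plain "top" anchor on a mark lets other marks stack on it.
        return ["mark-to-mark base"]
    return ["mark-to-base base"]


# circumflexcomb with the two anchors from the example above.
for name in ("_top", "top"):
    print(name, anchor_roles(name, glyph_is_mark=True))
```

In a one-type-per-anchor model such as the one described for FontForge, the two roles returned for `_top` would be stored as two separate anchors.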

typoman commented 5 years ago

What is a MarkToMark base anchor going to do? I've never heard of that one. MarkToMark is supposed to attach a mark to another mark. Why should it be written on a base glyph?

benkiel commented 5 years ago

For the above example, I think what @moyogo means for base is the mark that the other mark attaches to. That example is for when the mark attaches to a letter, has a mark that attaches to it, and attaches to another mark.

typoman commented 5 years ago

Then what is the difference between a mark-to-mark base anchor and a mark-to-mark mark anchor?

tiroj commented 5 years ago

> What is a MarkToMark base anchor going to do? I've never heard of that one. MarkToMark is supposed to attach a mark to another mark. Why should it be written on a base glyph?

Because the base is a diacritic carrying a precomposed mark?

Typically, my mark-to-mark anchors involve 2nd glyph y adjustments to obtain a consistent optical distance between the top of the 1st mark and the bottom of the 2nd mark for above marks (and vice versa for below marks), while my mark-to-base anchors use the default vertical alignment of the marks. Obviously the mark-to-mark anchor adjustments should also apply when the mark is applied to a diacritic glyph with a precomposed mark. So, e.g. applying acutecomb to ä would need the mark-to-mark adjustment of the height of the acutecomb from the mark-to-diaeresis mkmk anchor.

So either one needs to use the mark-to-mark anchor on the diacritic bases, or to have a third anchor for mark-to-diacritic.

benkiel commented 5 years ago

@moyogo for your Glyphs/ufo2ft example, is my assumption that the tool has some knowledge of what marks should be in the mark to mark feature correct? (this is the way I've done it in the past, just want to double check my assumption).

typoman commented 5 years ago

~~In circumflexcomb glyph example (which is a base glyph), there shouldn't be any MarkToMark positioning. Or am I wrong? I don't think it makes sense to have two types per anchor.~~

behdad commented 5 years ago

@typoman You should study the OpenType spec, in much more detail than you know currently.

> In circumflexcomb glyph example (which is a base glyph),

What does "is a base glyph" mean / refer to here? "comb" in the glyph name means combining.

benkiel commented 5 years ago

(not a real letter, etc)

That's the case we're talking about.

typoman commented 5 years ago

Thank you for the clarification. I think @moyogo's terminology is different from what I've understood so far, hence the confusion. But in my experience with Glyphs app, in a combining mark each anchor gets one rule in Adobe feature syntax:

top anchor will get a mark rule in mkmk feature: pos mark [circumflexcomb] <anchor -207 670> mark @mark_top;
_top will get a markClass definition: markClass [circumflexcomb] <anchor -207 448> @mark_top;

What's the third rule he's referring to? I don't see it in the Glyphs app feature file.
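
For reference, the two rules quoted above can be reproduced from the anchor names with a few lines of Python. This is only a sketch of the one-rule-per-anchor behavior described in this comment, with a made-up function name `mark_glyph_rules`; it does not claim to match what Glyphs generates in other cases.

```python
def mark_glyph_rules(glyph_name, anchors):
    """Emit one feature-file statement per anchor of a combining mark.

    anchors maps anchor name -> (x, y).
    """
    rules = []
    for name, (x, y) in anchors.items():
        if name.startswith("_"):
            # "_top" defines where this mark attaches to something else.
            rules.append(
                f"markClass [{glyph_name}] <anchor {x} {y}> @mark_{name[1:]};"
            )
        else:
            # "top" lets another mark attach to this mark (mkmk).
            rules.append(
                f"pos mark [{glyph_name}] <anchor {x} {y}> mark @mark_{name};"
            )
    return rules


for rule in mark_glyph_rules("circumflexcomb", {"top": (-207, 670), "_top": (-207, 448)}):
    print(rule)
```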

unified-font-object / ufo-spec

Better anchor definition #32