Closed moyogo closed 3 years ago
The following should also be allowed in the font’s lib:
- public.truetype.maxStorage: integer
- public.truetype.maxFunctionDefs: integer
- public.truetype.maxInstructionDefs: integer
- public.truetype.maxStackElements: integer
- public.truetype.maxSizeOfInstructions: integer

This is what RoboFont has supported for a long time:
https://gitlab.com/typemytype/robofont_com/issues/12
The main idea was not to abstract binary data, but to keep the data as close as possible to what fontTools requires. An authoring tool can generate and build this data while compiling a binary font. This is used by all RoboFont hinting tools.
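To illustrate the "keep it close to what the compiler needs" idea, here is a minimal sketch of how a compiler might copy these lib values onto a maxp table. The function name `apply_maxp_values` and the stand-in `maxp` object are assumptions made for illustration; only the lib key names come from this thread.

```python
from types import SimpleNamespace

# Lib keys proposed in this thread; each suffix doubles as the maxp field name.
MAXP_KEYS = (
    "maxStorage",
    "maxFunctionDefs",
    "maxInstructionDefs",
    "maxStackElements",
    "maxSizeOfInstructions",
)
PREFIX = "public.truetype."


def apply_maxp_values(font_lib, maxp):
    """Copy any public.truetype.max* integers from the font lib onto maxp."""
    applied = {}
    for name in MAXP_KEYS:
        value = font_lib.get(PREFIX + name)
        if value is None:
            continue  # key absent: keep whatever the compiler computed
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{PREFIX + name} must be a non-negative integer")
        setattr(maxp, name, value)
        applied[name] = value
    return applied


# Stand-in for a maxp table object (hypothetical; a real compiler has its own).
maxp = SimpleNamespace(maxStorage=0, maxFunctionDefs=0)
font_lib = {
    "public.truetype.maxStorage": 64,
    "public.truetype.maxFunctionDefs": 12,
}
applied = apply_maxp_values(font_lib, maxp)
```

Keys that are missing from the lib are simply skipped, so a font without hinting data compiles as before.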
@moyogo could you write up a PR for this? I think having it spec'ed out clearly would help the discussion, and I agree that having a way to store TT hints in UFO is good. We've been waiting for VTT to be open sourced as a higher level abstraction, but perhaps too long now.
I am wondering if a hash of the glyph order might be handy too? Certain TT instructions can refer to other glyphs in the font, and these references are defined by glyph index.
Therefore it is possible that certain TT instructions will not compile if the original glyph order has been changed.
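A glyph order hash could be as simple as the sketch below. The function name and the choice of SHA-256 are assumptions, not part of any proposal, and it assumes glyph names never contain a newline.

```python
import hashlib


def glyph_order_hash(glyph_order):
    """Return a hex digest that changes whenever the glyph order changes."""
    # One name per line, so ["ab", "c"] and ["a", "bc"] hash differently.
    data = "\n".join(glyph_order).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


original = glyph_order_hash([".notdef", "A", "Aacute", "acute.case"])
reordered = glyph_order_hash([".notdef", "Aacute", "A", "acute.case"])
```

A compiler would store `original` next to the instructions and refuse to compile them if the recomputed value differs.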
@BoldMonday The proposed id-hash uses the outline of components. Would that be enough?
@benkiel I opened the merge request #94 with the lib.plist instructions-related keys in a single dict.
Can glyph IDs be referenced in controlValueProgram or fontProgram?
@BoldMonday What instruction do you have in mind?
@moyogo Thank you for doing this! I have a feeling this will spend a bit of time in back and forth to get it well pinned down, but before it starts feeling like death by a thousand comments: thank you so much for putting this PR together.
@anthrotype @khaledhosny @behdad I'm sure you'll want to comment/review, as will @typesupply
Is using a different format for TT assembly code out of scope for this discussion? I have recently been using htic to compile TT instructions. It uses a slightly different assembly text format, but conversion from FontTools-style assembly is trivial. The big advantage is that htic optimizes the code on compilation very effectively, on par with Visual TrueType's push optimization. An example:
o {
SVTCA[X]
MDAP[R] 40
MDAP[R] 20
SRP0 40
MDRP[M>RGr] 0
SRP0 20
MDRP[M>RGr] 10
SRP0 0
MDRP[M>RGr] 30
SRP0 10
MDRP[M>RGr] 41
SVTCA[Y]
CALL 10 5 10
CALL 10 15 6
SRP0 5
MIRP[M<RGr] 25 4
SRP0 15
MIRP[M<RGr] 35 4
IUP[Y]
IUP[X]
}
The original code looks like this:
<assembly>
PUSHW[ ] /* 1 value pushed */
40
MDAP[1] /* MoveDirectAbsPt */
PUSHW[ ] /* 1 value pushed */
20
MDAP[1] /* MoveDirectAbsPt */
PUSHW[ ] /* 1 value pushed */
40
SRP0[ ] /* SetRefPoint0 */
PUSHW[ ] /* 1 value pushed */
0
MDRP[11100] /* MoveDirectRelPt */
PUSHW[ ] /* 1 value pushed */
20
SRP0[ ] /* SetRefPoint0 */
PUSHW[ ] /* 1 value pushed */
10
MDRP[11100] /* MoveDirectRelPt */
PUSHW[ ] /* 1 value pushed */
0
SRP0[ ] /* SetRefPoint0 */
PUSHW[ ] /* 1 value pushed */
30
MDRP[11100] /* MoveDirectRelPt */
PUSHW[ ] /* 1 value pushed */
10
SRP0[ ] /* SetRefPoint0 */
PUSHW[ ] /* 1 value pushed */
41
MDRP[11100] /* MoveDirectRelPt */
SVTCA[0] /* SetFPVectorToAxis */
PUSHW[ ] /* 3 values pushed */
5 10 10
CALL[ ] /* CallFunction */
PUSHW[ ] /* 3 values pushed */
15 6 10
CALL[ ] /* CallFunction */
PUSHW[ ] /* 1 value pushed */
5
SRP0[ ] /* SetRefPoint0 */
PUSHW[ ] /* 2 values pushed */
25 4
MIRP[10100] /* MoveIndirectRelPt */
PUSHW[ ] /* 1 value pushed */
15
SRP0[ ] /* SetRefPoint0 */
PUSHW[ ] /* 2 values pushed */
35 4
MIRP[10100] /* MoveIndirectRelPt */
IUP[0] /* InterpolateUntPts */
IUP[1] /* InterpolateUntPts */
</assembly>
While htic's result is this:
<assembly>
NPUSHB[ ] /* 22 values pushed */
35 4 15 25 4 5 15 6 10 5 10 10 41 10 30 0 10 20 0 40 20 40
SVTCA[1] /* SetFPVectorToAxis */
MDAP[1] /* MoveDirectAbsPt */
MDAP[1] /* MoveDirectAbsPt */
SRP0[ ] /* SetRefPoint0 */
MDRP[11100] /* MoveDirectRelPt */
SRP0[ ] /* SetRefPoint0 */
MDRP[11100] /* MoveDirectRelPt */
SRP0[ ] /* SetRefPoint0 */
MDRP[11100] /* MoveDirectRelPt */
SRP0[ ] /* SetRefPoint0 */
MDRP[11100] /* MoveDirectRelPt */
SVTCA[0] /* SetFPVectorToAxis */
CALL[ ] /* CallFunction */
CALL[ ] /* CallFunction */
SRP0[ ] /* SetRefPoint0 */
MIRP[10100] /* MoveIndirectRelPt */
SRP0[ ] /* SetRefPoint0 */
MIRP[10100] /* MoveIndirectRelPt */
IUP[0] /* InterpolateUntPts */
IUP[1] /* InterpolateUntPts */
</assembly>
(For this specific example, you need to tell htic that function 10 leaves no value on the stack, or the push optimization will not optimize across function calls. In the worst case, the result is the same unoptimized code as before, but usually you get at least some optimization.)
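To make the push optimization concrete, here is a simplified sketch (not htic's actual implementation) that hoists all per-instruction pushes into one merged push. It assumes no instruction in the block leaves values on the stack, which is exactly the condition described above for function 10. The `hoist_pushes` name and the data layout are made up for this illustration.

```python
def hoist_pushes(program):
    """Merge per-instruction pushes into a single push at the top.

    program: list of (push_values, instruction) pairs in execution order.
    The stack is last-in, first-out, so the values for the first instruction
    must end up last in the merged push; each group keeps its own order.
    """
    merged = []
    for values, _ in reversed(program):
        merged.extend(values)
    instructions = [instruction for _, instruction in program]
    return merged, instructions


# The unoptimized glyph program from the example above.
program = [
    ([40], "MDAP[1]"), ([20], "MDAP[1]"),
    ([40], "SRP0[ ]"), ([0], "MDRP[11100]"),
    ([20], "SRP0[ ]"), ([10], "MDRP[11100]"),
    ([0], "SRP0[ ]"), ([30], "MDRP[11100]"),
    ([10], "SRP0[ ]"), ([41], "MDRP[11100]"),
    ([], "SVTCA[0]"),
    ([5, 10, 10], "CALL[ ]"), ([15, 6, 10], "CALL[ ]"),
    ([5], "SRP0[ ]"), ([25, 4], "MIRP[10100]"),
    ([15], "SRP0[ ]"), ([35, 4], "MIRP[10100]"),
    ([], "IUP[0]"), ([], "IUP[1]"),
]
merged, instructions = hoist_pushes(program)
```

For this input, `merged` holds the same 22 values, in the same order, as the NPUSHB line in htic's result.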
Here is a more complete htic code example.
I am currently wiring htic up to ufo2ft#234 so FontLab hinting code from UFOs generated by vfb2ufo will compile into TTFs and stay 100% rendering-compatible with fonts generated directly from FontLab 5. The part that translates FL5 code to htic code is not public (yet). I'd be happy to see wider support for this.
> I have recently been using htic to compile TT instructions.
I think this is a very interesting proposal. How stable is the htic language?
You mean in terms of changes to the syntax? It seems quite stable. Since I've been aware of it (~1 year?) there has only been one change, the removal of sub-blocks, which would probably not have affected disassembled code.
My knowledge on this is extremely limited, so all I can offer are the general guidelines we used in the past for evaluating syntaxes:
I'm not saying that htic doesn't meet these requirements. In fact, I took a quick look at it and it seems extremely well organized and well documented, but, again, my knowledge on TT instructions is extremely limited. Just offering some evaluation advice...
@moyogo @justvanrossum Recently I had an email exchange with Greg Hitchcock, who mentioned some VTT instructions that deal with components. Those instructions use glyph IDs. Not sure if these specifics apply to VTT instructions only or to their native TT instructions as well. I will dig around more next week when I'm back in the studio.
@BoldMonday VTT code has instructions that do not translate into TrueType instructions. OFFSET, for example, uses glyph indices but isn’t translated into TrueType instructions; rather, it allows changing the composite glyph description.
IIRC only the "pseudo instruction" OFFSET[r|R] in composite glyphs uses glyph IDs.
/* The base glyph ID 61 is at offset 0, 0, no need to set the rounding flag */
OFFSET[r] 61, 0, 0
/* Set the "use my metrics" flag on the base glyph component */
USEMYMETRICS[]
/* A diacritic is added from glyph 542 with offset 128, 256; set the rounding flag */
OFFSET[R] 542, 128, 256
USEMYMETRICS[] is another pseudo instruction that influences the component flags.
Those have no equivalent in native TT instruction code. You need to set the component flags by some other means. I think most of the time a heuristic can be used (The first component with the same width as the composite glyph will get "use my metrics" set, any shifted components will get the "round" flag set, possibly only taking y-shift into account for modern y-hint-only fonts). FontLab sets the "round" flag on all components.
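The heuristic described above could look roughly like this. It is a sketch only: the `Component` class, the field names, and `guess_component_flags` are hypothetical and not taken from any existing tool.

```python
from dataclasses import dataclass


@dataclass
class Component:
    base: str
    x_offset: int = 0
    y_offset: int = 0
    use_my_metrics: bool = False
    round_to_grid: bool = False


def guess_component_flags(components, composite_width, widths, y_only=False):
    """Set flags heuristically; widths maps base glyph name to advance width."""
    metrics_assigned = False
    for comp in components:
        # The first component as wide as the composite gets "use my metrics".
        if not metrics_assigned and widths[comp.base] == composite_width:
            comp.use_my_metrics = True
            metrics_assigned = True
        # Shifted components get "round"; optionally only y shifts count.
        if y_only:
            comp.round_to_grid = comp.y_offset != 0
        else:
            comp.round_to_grid = comp.x_offset != 0 or comp.y_offset != 0
    return components


# Illustrative values; the width of acute.case is invented.
widths = {"A": 689, "acute.case": 300}
components = [Component("A"), Component("acute.case", x_offset=45)]
guess_component_flags(components, 689, widths)
```

With `y_only=True` the accent in this example would not get the "round" flag, because it is only shifted horizontally.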
These are the VTT instructions mentioned by Greg Hitchcock that use a glyph index as parameter:
OFFSET[]
SOFFSET[]
ANCHOR[]
SANCHOR[]
But if there are no native TT instructions that rely on glyph indexes then there is probably no necessity to check the integrity of the glyph order.
Regarding referencing things by index: Could UFO have a slight-fork of the TTX or htic syntax that uses glyph names or identifiers (for points, components, etc.) in place of indexes? I know that the goal is to take an existing storage format and use it, but indexes are very un-UFO. The documentation could state that glyph names/identifiers would be replaced by the index at compile time.
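As a sketch of what "replaced by the index at compile time" might mean for points: the `@name` token syntax below is invented for illustration and exists in neither TTX nor htic, and the code assumes the compiled point order equals the UFO point order.

```python
import re


def point_indices_by_name(contours):
    """contours: list of contours, each a list of point names (or None).

    Points are numbered consecutively across contours, starting at 0.
    """
    mapping = {}
    index = 0
    for contour in contours:
        for name in contour:
            if name is not None:
                if name in mapping:
                    raise ValueError(f"duplicate point name: {name}")
                mapping[name] = index
            index += 1
    return mapping


def resolve_point_names(assembly, mapping):
    """Replace each @name token (hypothetical syntax) with its point index."""
    def substitute(match):
        return str(mapping[match.group(1)])
    return re.sub(r"@([A-Za-z_][A-Za-z0-9_.]*)", substitute, assembly)


contours = [["stem_left", None, "stem_right", None], [None, "top"]]
mapping = point_indices_by_name(contours)
resolved = resolve_point_names("MDAP[R] @stem_left\nMDRP[M>RGr] @stem_right", mapping)
```

An unknown name raises a `KeyError`, which is probably what a compiler would want: fail loudly rather than hint the wrong point.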
htic already supports identifiers for points (and cvts, zones). It expects them to be defined in the htic input file, but a compiler could gather them from the glyph outlines ahead of compilation.
> These are the VTT instructions mentioned by Greg Hitchcock that use a glyph index as parameter:
> OFFSET[] SOFFSET[] ANCHOR[] SANCHOR[]
These are not even mentioned in the VTT language reference or VTT manual … :-/ It's really hard to get an exhaustive knowledge of this stuff.
If we want to use identifiers for zones and other values from the cvt, we would need a kind of "annotated" cvt format, not just an array of integers.
In the considerations to use htic, how is VTT relevant?
I think it was just to make sure we are missing no occurrence of glyph indices in TT code. Otherwise it is not relevant.
Could we make id (the hash) in the glyph lib optional?
The calculation of the glyph hash in Adobe FDK looks rather complicated. It would make an initial implementation of the instruction processing easier if the id could just be skipped.
If we can use named points for instructions, the instruction code would be fairly stable. If only on-curve points are hinted in a glyph, the instructions could even survive outline conversion from cubic to quadratic.
> htic already supports identifiers for points (and cvts, zones). It expects them to be defined in the htic input file, but a compiler could gather them from the glyph outlines ahead of compilation.
Sorry, I was mistaken. htic does not support identifiers for points (only for functions, cvts, and instruction flags). So we do need strict outline checking before compilation.
I have taken a look at the AFDKO hash function. It is complicated and parses the UFO by itself. We should define a hashing function for glyph outlines to be used, preferably one with a simpler implementation than AFDKO.
> I have taken a look at the AFDKO hash function. It is complicated and parses the UFO by itself. We should define a hashing function for glyph outlines to be used, preferably one with a simpler implementation than AFDKO.
PSAutoHint has a HashPointPen which was written as a ufoLib replacement of AFDKO’s hashing function and should give the same hash.
I have implemented compilation for htic TrueType code as outlined in #94 in my fork of ufo2ft. The instruction compiler is called automatically in ufo2ft’s post-processing step if you have htic installed.
You can try and compile the attached file with fontmake:
$ fontmake -u "IBM Plex Serif-Text.ufo" -o ttf --keep-overlaps --keep-direction --output-path "IBM Plex Serif-Text.ttf"
The only change from @moyogo’s proposal is that the controlValue entry doesn't use a list of integers, but a string of htic code.
A full htic file is additionally saved inside the UFO package as instructions.hti.
Thinking out loud: would it be useful for UFO to be able to store either the TTX dump or htic, for flexibility? I can see how that could both be useful and a nightmare; @moyogo did you have any thoughts about htic?
I forgot to mention: At the moment you also need my special version of htic for the compilation to work:
https://gitlab.com/jenskutilek/htic
And in the compiled example font there is a bug on the x-height at 18 ppm that I have to find and squash.
Here's an updated version of the demo ufo: IBM Plex Serif-Text.ufo.zip. Some function definitions were wrong, which caused htic to do push optimization where it was not possible.
BTW by doing the demo implementation already, I'm not trying to push for htic, it's just to show that it could work, and I had done some of the work before this discussion came up.
I would also be happy if the TT assembly representation in UFO matched the FontTools representation. Compiling with optimizations can be done in a separate step if desired. Including both versions in the UFO spec is probably too much trouble.
Sorry for not replying earlier.
I’m fine with htic if it can do bytecode instructions round-tripping without optimization. Having the option to do optimization is nice but may have unwanted effects, so I wouldn’t want it all the time.
The only thing that is currently keeping htic from simple roundtripping is the lack of support for the DELTAC1, DELTAC2, DELTAC3, DELTAP1, DELTAP2, DELTAP3 instructions. Only the "convenience instructions" deltac and deltap are supported. But that is an easy fix.
Do you need to preserve the push data sizes? Similar to the delta instructions, htic only has a generic push instruction that will choose the optimal instruction (PUSHB, PUSHW, NPUSHB, NPUSHW) at compile time.
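For reference, choosing the push instruction from the data could work along these lines. This is a sketch of the general rule (byte vs. word values, at most 8 values for the short forms), not htic's code; `choose_push` is a made-up name and the mnemonics follow the FontTools-style spelling used in the dumps above.

```python
def choose_push(values):
    """Return the smallest push instruction that can hold all values."""
    if not values:
        raise ValueError("nothing to push")
    if len(values) > 255:
        raise ValueError("more than 255 values: split into several pushes")
    if all(0 <= v <= 255 for v in values):
        kind = "B"  # every value fits in an unsigned byte
    elif all(-32768 <= v <= 32767 for v in values):
        kind = "W"  # at least one value needs a signed 16-bit word
    else:
        raise ValueError("value out of range for a 16-bit word")
    # PUSHB/PUSHW encode a count of 1 to 8 in the opcode itself;
    # NPUSHB/NPUSHW take the count as an extra byte.
    if len(values) <= 8:
        return f"PUSH{kind}[ ]"
    return f"NPUSH{kind}[ ]"


single = choose_push([40])
merged = choose_push([35, 4, 15, 25, 4, 5, 15, 6, 10, 5, 10,
                      10, 41, 10, 30, 0, 10, 20, 0, 40, 20, 40])
wide = choose_push([300, 5])
```

A roundtripping mode would have to bypass this choice and keep whatever instruction the original bytecode used, even when it is not the smallest one.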
In https://github.com/daltonmaag/vttLib/, I just went away from storing the VTT data dump in the UFO data directory to storing them in external single blobs because I want to support interpolated instances and potentially variable fonts. Storing that data in UFOs makes little sense to me.
This proposal works when you compile a single UFO to a single static TTF in a specific way. Is there a story for instance or variable font generation as well or is that out-of-scope? At least the latter is going to be difficult because a variable font will probably have overlaps (and may use extra instructions), a static font probably won't, and you potentially need different hinting code for both.
In the issue description I wrote the following:
It would be useful to have a standard way of storing bytecode TrueType instructions in UFOs, especially in the case of extracting UFOs from TTFs and being able to compile that back into TTFs. Or this could be useful when processing UFOs in a standard way with tools that can compile TrueType instructions.
Just as with public.postscript.hints, the public.truetype.instructions would be ignored when the hash of glyph outlines doesn’t match anymore.
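In code, the "ignore when the hash doesn't match" rule might look like the sketch below. The `outline_hash` function here is a deliberately naive stand-in and is not the AFDKO or psautohint hash; the function names and the contour data layout are assumptions. The lib key names follow the structure proposed in the issue description.

```python
import hashlib


def outline_hash(contours):
    """Naive outline hash (an assumption, not the AFDKO algorithm).

    contours: list of contours, each a list of (x, y, segment_type) tuples.
    """
    parts = []
    for contour in contours:
        points = ";".join(f"{seg}:{x},{y}" for x, y, seg in contour)
        parts.append(f"[{points}]")
    return hashlib.sha256("".join(parts).encode("utf-8")).hexdigest()


def usable_instructions(glyph_lib, contours):
    """Return the stored assembly, or None if it is missing or stale."""
    data = glyph_lib.get("public.truetype.instructions")
    if data is None:
        return None
    if data.get("id") != outline_hash(contours):
        return None  # outline changed after hinting: ignore the instructions
    return data.get("instructionList")


contours = [[(0, 0, "line"), (100, 0, "line"), (100, 700, "line")]]
glyph_lib = {
    "public.truetype.instructions": {
        "formatVersion": "1",
        "id": outline_hash(contours),
        "instructionList": "PUSHB[ ]\n0\nMDAP[1]",
    }
}
fresh = usable_instructions(glyph_lib, contours)
moved = [[(0, 0, "line"), (110, 0, "line"), (100, 700, "line")]]
stale = usable_instructions(glyph_lib, moved)
```

Here `fresh` is the stored assembly and `stale` is `None`, so the compiler would fall back to an unhinted glyph.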
Ok, so my concern is out of scope. :)
Coming back to this, as it seems to have been a bit derailed by the talk of htic: are the limitations that @jenskutilek mentioned a concern? I think getting something into the format for this would be good.
@moyogo and @jenskutilek: I'd like to get @moyogo's PR reviewed and merged: did you come to a consensus on whether htic should be used or not?
Yes please.
Yes to htic or yes to getting this merged after resolving that?
To be merged. Sorry on phone. Thanks.
@behdad I'm going to wait a bit to see what @moyogo and @jenskutilek say as it seems that they've been moving this forward a bit in other places.
From my read, open concerns are:
- Can htic do bytecode instructions round-tripping without optimization
- Are Glyph IDs used anywhere in the htic or TTX instructions (seems no, but I might be missing a confirmation)

Yes. I just meant I'm supportive of finishing and merging this. :)
- Can htic do bytecode instructions round-tripping without optimization

Not in its current state. It would require the addition of specialised PUSHB, PUSHW, NPUSHB, NPUSHW and DELTA[CP][123] instructions instead of the generic push and delta, and an option to not group push and delta instructions. There's already an issue on htic for the instructions. Avoiding the grouping is easy; I think I have added that to my fork. Fixing this shouldn't be too hard. I can investigate this.
- Are Glyph IDs used anywhere in the htic or TTX instructions (seems no, but I might be missing a confirmation)

They aren't.
The other thing to consider is @typesupply’s suggestion to allow point identifiers instead of indexes. That would require changes in both TTX and htic. Implementing this in htic may involve more work, because as it is, the htic compiler doesn't know anything about the glyphs (and thus cannot look up any point identifiers). A hint authoring tool would have to provide a mapping of identifier to index to the compiler, similar to what is already possible for cvts:
cvt {
32 zones_off # CVT 0
500 x_height # CVT 1
700 2 # index instead of label
}
Named points sound like a good idea, but I can't judge how useful they really are:
A hinting tool could use a mapping internally and write out point indices to htic code. Though htic code can be used directly, I would expect it is more likely to be used as an intermediate step between a high-level language and pure TT assembly code/bytecode. For roundtripping, it's hard to add point labels because it is not obvious which numbers in the decompiled bytecode correspond to point indexes.
Anyway, let's do this :)
> The other thing to consider is @typesupply’s suggestion to allow point identifiers instead of indexes.
If this is an impediment to implementing, stick with indexes for now. If outline sync becomes an issue, identifiers could be implemented in a version 2. Perhaps store a format version of the instructions should the need to modify the syntax arise.
We may need to add component flags to UFO components (round, useMyMetrics, overlap). Though in most cases those can be set heuristically when compiling the font. Maybe that's also something for a future version? Or those flags could be added to the glyph lib instead of directly to the component.
@jenskutilek how does glyph.components fall down here?
Glyph component flags are part of the TrueType hinting, I'd argue. At least the "ROUND" flag has influence on the rasterization. To be able to roundtrip the flags, they need to be stored in the UFO somewhere. Currently there is no place to store them.
In ttx, they are stored as hex numbers:
<TTGlyph name="Aacute" xMin="16" yMin="0" xMax="673" yMax="957">
<component glyphName="A" x="0" y="0" flags="0x200"/>
<component glyphName="acute.case" x="45" y="0" flags="0x4"/>
</TTGlyph>
In UFO, something like this might be more elegant:
<?xml version='1.0' encoding='UTF-8'?>
<glyph name="Aacute" format="2">
<advance width="689"/>
<unicode hex="00C1"/>
<outline>
<component base="A" useMyMetrics="true" overlap="false"/>
<component base="acute.case" xOffset="45" round="true"/>
</outline>
</glyph>
Absence of the flags could mean that the compiler should deduce them.
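For roundtripping, converting between the hex flags in the ttx dump and named booleans is straightforward. The bit values below are the composite glyph flags from the TrueType glyf table; the function names and the dictionary keys (borrowed from the attribute names in the GLIF example above) are assumptions.

```python
# Composite glyph flag bits from the TrueType 'glyf' table.
ROUND_XY_TO_GRID = 0x0004
USE_MY_METRICS = 0x0200
OVERLAP_COMPOUND = 0x0400


def decode_component_flags(flags):
    """Split a ttx flags value into the three hinting-related booleans."""
    return {
        "round": bool(flags & ROUND_XY_TO_GRID),
        "useMyMetrics": bool(flags & USE_MY_METRICS),
        "overlap": bool(flags & OVERLAP_COMPOUND),
    }


def encode_component_flags(round=False, useMyMetrics=False, overlap=False):
    """Inverse of decode_component_flags for these three bits only."""
    flags = 0
    if round:
        flags |= ROUND_XY_TO_GRID
    if useMyMetrics:
        flags |= USE_MY_METRICS
    if overlap:
        flags |= OVERLAP_COMPOUND
    return flags


base = decode_component_flags(int("0x200", 16))   # the "A" component
accent = decode_component_flags(int("0x4", 16))   # the "acute.case" component
```

Other composite flag bits describe how the component record itself is encoded, so a compiler would derive those on its own rather than read them from the UFO.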
> To be able to roundtrip the flags, they need to be stored in the UFO somewhere.
Hm. This could work well with another idea that I have been thinking about...
There has been some interest in adding a lib attribute at the API level in defcon and fontParts for objects that don't currently have a lib (contour.lib, component.lib, etc.). Putting that on objects is no problem, but we'd need to store it in UFO and that is much more difficult. GLIF readers/writers are hardwired to the current GLIF schema and those will need to be updated to handle new elements and attributes. The structure of ufoLib.glifLib makes it pretty easy to do this, but the other readers/writers out there are out of our hands. I've been thinking that we could handle it this way until we feel comfortable making a major format change. In the GLIF <lib> element, establish some new public keys:
public.contour.lib
public.component.lib
public.etc.lib
Each of these keys would have a dict as their value. The dict keys would be the object identifiers and the values would be dicts.
At the ufoLib.glifLib level or even the defcon level, when reading GLIF these parts of the lib would be popped from the dict and set as the lib of the object with the matching identifier.
So, for this use case, the component flags would be located here in the defcon and fontParts API:
component.lib["public.htic"]["useMyMetrics"]
component.lib["public.htic"]["round"]
component.lib["public.htic"]["overlap"]
In the UFO the data would be stored in the GLIF like this:
<lib>
<dict>
<key>public.component.lib</key>
<dict>
<key>component1</key>
<dict>
<key>public.htic</key>
<dict>
<key>useMyMetrics</key>
<true/>
<key>overlap</key>
<false/>
<key>round</key>
<true/>
</dict>
</dict>
</dict>
</dict>
</lib>
This could be implemented pretty quickly and be backwards compatible without risk of data loss.
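The "pop from the glyph lib on read" step could be sketched like this, using plain dicts in place of defcon or glifLib objects. The function name `pop_object_libs` and the return shape are assumptions made for illustration.

```python
# Glyph lib keys that hold per-object libs, mapped to the kind of object.
OBJECT_LIB_KEYS = {
    "public.contour.lib": "contour",
    "public.component.lib": "component",
}


def pop_object_libs(glyph_lib):
    """Remove per-object libs from glyph_lib.

    Returns {(kind, identifier): lib}; the caller would then attach each lib
    to the contour or component carrying that identifier.
    """
    result = {}
    for key, kind in OBJECT_LIB_KEYS.items():
        for identifier, lib in glyph_lib.pop(key, {}).items():
            result[(kind, identifier)] = lib
    return result


glyph_lib = {
    "com.example.other": 1,  # unrelated key, left in place
    "public.component.lib": {
        "component1": {
            "public.htic": {"useMyMetrics": True, "overlap": False, "round": True}
        }
    },
}
object_libs = pop_object_libs(glyph_lib)
```

Writing would do the reverse: collect the libs of all objects that have an identifier and put them back under the public keys, so readers that don't know about these keys still preserve the data as ordinary lib content.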
This makes sense, and I agree it should be added (the .libs).
Could we add lib keys to hold fontTools TTX instructions?
This would be like #42 but for TrueType instructions. It would be useful to have a standard way of storing bytecode TrueType instructions in UFOs, especially in the case of extracting UFOs from TTFs and being able to compile that back into TTFs. Or this could be useful when processing UFOs in a standard way with tools that can compile TrueType instructions.
It could use the following structure:

In the glyph’s lib:
- public.truetype.instructions as a dict:
  - formatVersion: string "1"
  - id: string, hash of glyph outlines (similar to the public.postscript.hints id)
  - instructionList: string, TTX assembly

In a font’s lib:
- public.truetype.fontProgram: string, TTX assembly
- public.truetype.controlValuesProgram: string, TTX assembly
- public.truetype.controlValues as an array of integers.
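As a rough illustration of the proposed structure, the two libs could be built and serialized with Python's plistlib as follows. All values are placeholders invented for the example, including the id string and the short assembly snippets.

```python
import plistlib

# Glyph lib entry as proposed above; the id is a placeholder, not a real hash.
glyph_lib = {
    "public.truetype.instructions": {
        "formatVersion": "1",
        "id": "placeholder-outline-hash",
        "instructionList": "PUSHB[ ]\n40\nMDAP[1]",
    }
}

# Font lib entries as proposed above, with made-up content.
font_lib = {
    "public.truetype.fontProgram": "PUSHB[ ]\n0\nFDEF[ ]\nENDF[ ]",
    "public.truetype.controlValuesProgram": "SVTCA[0]",
    "public.truetype.controlValues": [0, 500, 700],
}

# Both are plain property lists, so they survive a write/read cycle unchanged.
glyph_lib_xml = plistlib.dumps(glyph_lib)
font_lib_xml = plistlib.dumps(font_lib)
glyph_lib_back = plistlib.loads(glyph_lib_xml)
font_lib_back = plistlib.loads(font_lib_xml)
```

Because everything is stored as strings, integers and dicts, existing UFO readers should carry the data along without needing to understand it.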