Add CFF core classes and data structures

camertron commented 6 years ago

This pull request is part of a larger effort to bring OTF support to TTFunk. See https://github.com/prawnpdf/ttfunk/issues/53 for details.

Highlights of this PR:

The two CFF data structures added in this PR are Index and Dict. An index is essentially an array that can be accessed by index, and a dict is a dictionary of operator/operand (i.e. key/value) pairs. Operators have special meaning depending on the context. For example the CFF top and private dicts both use the dict structure but define different operators for the data they contain.
The Dict class technically isn't used anywhere yet. Including it when it is actually used would make the corresponding PR too large in my opinion, so it is introduced here instead.
The Sci class (couldn't come up with a better name, suggestions welcome) is supposed to represent a number in scientific notation (i.e. 5.2 x 10⁴). The Dict class uses it for a special type of operand that starts with a byte 30. I could have stored it as a float, but it would then have been a lot more work to re-encode later.
The TTFunk::SubTable class has been introduced to handle the case when table-like parsing behavior is desired but the table is not a top-level, directory-based font table.
The beginnings of an incomplete CFF table are also present. Currently only the header and name index sub tables are supported, more coming soon. CFF is by far the most complicated table TTFunk supports to date and the centerpiece of the open-type font format.

pointlessone commented 6 years ago

Could you please provide a link to spec. This is a rather logic-heavy PR. I'd like to at least try and verify the logic as well as heaps of magic numbers.
I'd rather leave Dict until it's actually used. PRs are as big as they need to be. You can use commits within a PR to show logical progression and further split changes into smaller pieces.
Sci is indeed a bit vague. Maybe SkiForm? It's not much better but at least resemble a noun and maybe a bit more telling to people who are familiar with a proper term.
Reasonable. Though, I'd expect a more generically named Table would be useful for all table-like structures and more specialized subclasses would implement specifics of top-level tables. Maybe we should consider this as an intermediary refactoring step either before or after this PR. WDYT?
Is the CFF table functional within the PR?

camertron commented 6 years ago

This is the document I used to implement the CFF table.
Ok, that's fine.
Hmm, maybe SciForm? The math world has to have invented a word for this, but I don't know what it is. I'll keep thinking about it.
I see your point. I think the original authors created the Table class to correspond to the True-Type concept of a table, which is a structure that has an entry in the font's directory. The structures within the CFF table aren't truly tables, or at least not according to the spec. So perhaps the right move here is to rename SubTable to something like CFFStructure.
Well... that's a complicated question. I don't believe the CFF table is valid with only a header and name index. In addition, subsetting OTF files at the moment will break because the glyf table doesn't exist for CID-keyed fonts. The last branch, otf_encoder, handles encoding OTF fonts and takes the missing glyf table into consideration.

Just a reminder that we're merging everything into the otf branch not master, so there's no danger of messing things up by merging incomplete features. I appreciate what you said about PRs being as large as they need to be, but I really don't see the need for the otf branch to be fully working after every one of my smaller PRs. It will be fully working when they have all been merged. I split things up to make the whole feature conceptually easier to understand. Just as I have submitted PRs for individual tables like VORG and DSIG, these upcoming CFF PRs introduce a few sub-tables (i.e. CFF structures) at a time. If you're ok with the entire CFF feature landing at once, I can create a single giant PR for the remaining work. A valid CFF table contains 5 top-level structures.

pointlessone commented 6 years ago

This PR is rather heavy on complex logic but specs are sparse. Maybe you could add a bit more coverage?

camertron commented 6 years ago

Hey @pointlessone, I think I've addressed all your concerns specifically I have:

Added specs for Dict and Index (caught several bugs too hehe).
Updated Index to parse everything immediately, which simplified the logic considerably.
Added validation of Dict operands (specifically sci form operands since it's not really possible to validate integers).

prawnpdf / ttfunk

Add CFF core classes and data structures #59