fonttools / fonttools

A library to manipulate font files from Python.
MIT License

Unpack binary strings #564

Open davelab6 opened 8 years ago

davelab6 commented 8 years ago

Some sfnt values are presented as binary strings, e.g.:

    <macStyle value="00000000 00000001"/>
    <fsType value="00000000 00000000"/>
    <fsSelection value="00000000 00100000"/>
    <ulCodePageRange1 value="00100000 00000000 00000000 10010011"/>
    <ulCodePageRange2 value="00000000 00000000 00000000 00000000"/>

It would help human readability, and flatten the learning curve for newcomers, if these were unpacked.

The OpenType specification is unambiguous about what each digit means in human language terms.
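
As an illustration of what "unpacked" could mean here, the following is a minimal sketch (not fontTools code) that turns one of the binary strings above into flag names. The bit names follow the `fsSelection` bit assignments in the OpenType OS/2 table; the helper `unpack_flags` and the `FS_SELECTION_BITS` table are hypothetical names made up for this example.

```python
# Bit names for OS/2.fsSelection; bit 0 is the least significant bit.
# The dict and the function name are illustrative, not part of fontTools.
FS_SELECTION_BITS = {
    0: "ITALIC",
    1: "UNDERSCORE",
    2: "NEGATIVE",
    3: "OUTLINED",
    4: "STRIKEOUT",
    5: "BOLD",
    6: "REGULAR",
    7: "USE_TYPO_METRICS",
    8: "WWS",
    9: "OBLIQUE",
}


def unpack_flags(binary_string, bit_names):
    """Return the names of the bits set in a TTX-style binary string."""
    value = int(binary_string.replace(" ", ""), 2)
    names = []
    bit = 0
    while value >> bit:
        if (value >> bit) & 1:
            # Bits without a known name are reported by position.
            names.append(bit_names.get(bit, "bit%d" % bit))
        bit += 1
    return names


print(unpack_flags("00000000 00100000", FS_SELECTION_BITS))  # ['BOLD']
```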

The panose values are unpacked, like this:

    <panose>
      <bFamilyType value="0"/>
      <bSerifStyle value="0"/>
      <bWeight value="8"/>
      <bProportion value="0"/>
      <bContrast value="0"/>
      <bStrokeVariation value="0"/>
      <bArmStyle value="0"/>
      <bLetterForm value="0"/>
      <bMidline value="0"/>
      <bXHeight value="0"/>
    </panose>

It could go further for human readability, though. §2.3 of http://monotype.de/services/pan2 is clear that the values of Weight are:

0-Any
1-No Fit
2-Very Light
3-Light
4-Thin
5-Book
6-Medium
7-Demi
8-Bold
9-Heavy
10-Black
11-Extra Black

So this would be more readable as:

    <panose>
      <bFamilyType value="Any"/>
      <bSerifStyle value="Any"/>
      <bWeight value="Bold"/>
      <bProportion value="Any"/>
      <bContrast value="Any"/>
      <bStrokeVariation value="Any"/>
      <bArmStyle value="Any"/>
      <bLetterForm value="Any"/>
      <bMidline value="Any"/>
      <bXHeight value="Any"/>
    </panose>
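
A rough sketch of the lookup this would need, using only the Weight values listed above. The names `PANOSE_WEIGHT_NAMES`, `weight_name` and `weight_value` are hypothetical; accepting plain integers when reading is an assumption made here so that existing TTX files would still load.

```python
# Weight names in value order, copied from the list above (index == value).
PANOSE_WEIGHT_NAMES = [
    "Any", "No Fit", "Very Light", "Light", "Thin", "Book",
    "Medium", "Demi", "Bold", "Heavy", "Black", "Extra Black",
]


def weight_name(value):
    """Map a bWeight integer to its name, falling back to the number itself."""
    if 0 <= value < len(PANOSE_WEIGHT_NAMES):
        return PANOSE_WEIGHT_NAMES[value]
    return str(value)


def weight_value(text):
    """Map a name (or a plain integer string) back to the bWeight integer."""
    if text in PANOSE_WEIGHT_NAMES:
        return PANOSE_WEIGHT_NAMES.index(text)
    return int(text)


print(weight_name(8))        # Bold
print(weight_value("Bold"))  # 8
```

Reading both `"Bold"` and `"8"` is one way to keep old dumps compilable, at the cost of two spellings for the same value.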

Arno-Enslin commented 8 years ago

I have written a batch file that comments the TTX files. I would prefer your request to be handled with an option. As the default, the current output is good in my opinion, because it makes the OT specification easier to follow.

Edit: I also comment the name table with the help of the batch file.

behdad commented 8 years ago

> I have written a batch file that comments the TTX files. I would prefer your request to be handled with an option. As the default, the current output is good in my opinion, because it makes the OT specification easier to follow.
>
> Edit: I also comment the name table with the help of the batch file.

What does this have to do with this issue? Please help us keep noise in our projects low. Thanks.

behdad commented 8 years ago

Dave, I'm not excited about changing the file format and risking making the gods angry again! ("but sometimes it's useful to see the binary value" :D). If someone wants to offer a patch that is backward compatible, I believe we can consider that.

Arno-Enslin commented 8 years ago

> What does this have to do with this issue? Please help us keep noise in our projects low. Thanks.

Damn! My batch file does, in principle, what davelab6 has requested. It adds explanations to the TTX file. With the help of the comments, the file is much easier to edit.

behdad commented 8 years ago

> > What does this have to do with this issue? Please help us keep noise in our projects low. Thanks.
>
> Damn! My batch file does, in principle, what davelab6 has requested. It adds explanations to the TTX file. With the help of the comments, the file is much easier to edit.

Perhaps you can add a before/after sample to that issue? I looked at the batch file briefly but could not understand what it does.

behdad commented 8 years ago

Ok, now I see you are adding a few specific comments. Surely we can add those if you paste before/after.

Arno-Enslin commented 8 years ago

At the moment it comments the name table according to the old Adobe naming convention. Comment TTX.zip

davelab6 commented 8 years ago

Having the 'human readable' or unpacked version of the binary string in an XML comment is a perfect solution! :)
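
A small sketch of that idea, assuming the binary value stays in the attribute exactly as it is today and the readable form only appears in a trailing comment (which an XML parser skips, so the file would still compile). The bit names follow the `macStyle` field of the `head` table; `commented_element` is a hypothetical helper, not an existing fontTools function.

```python
# head.macStyle bit names, index == bit number (bit 0 is least significant).
MAC_STYLE_BITS = [
    "Bold", "Italic", "Underline", "Outline",
    "Shadow", "Condensed", "Extended",
]


def commented_element(tag, binary_string, bit_names):
    """Render a TTX-style element followed by a comment naming the set bits."""
    value = int(binary_string.replace(" ", ""), 2)
    set_names = [name for bit, name in enumerate(bit_names) if (value >> bit) & 1]
    comment = ", ".join(set_names) or "no bits set"
    return '<%s value="%s"/>  <!-- %s -->' % (tag, binary_string, comment)


print(commented_element("macStyle", "00000000 00000001", MAC_STYLE_BITS))
# <macStyle value="00000000 00000001"/>  <!-- Bold -->
```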

justvanrossum commented 8 years ago

I personally would have no issue with changing the TTX output at this point, as long as the existing Python API stays the same.

moyogo commented 8 years ago

For PANOSE, it’s really messy. Depending on the value of bFamilyType, the tags of the following elements don’t make sense anymore. For example, if bFamilyType is "Latin Hand Written", then bSerifStyle defines the Tool Kind, not the Serif Style, and its possible values, besides 0="Any" and 1="No Fit", are completely different. One would have to look up the possible values anyway. So keeping the value as an integer doesn’t change much, and the description can be left as a comment.
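
To make the dependency concrete, here is a sketch of a label lookup keyed on the family kind. The digit labels for "Latin Text" and "Latin Hand Written" are written from a reading of the PANOSE 1.0 guide and should be checked against it before being relied on; the other family kinds are deliberately left out, and all names in the code are hypothetical.

```python
# Meaning of PANOSE digits 1-10 per family kind (digit 1 is the family kind).
# Only two family kinds are filled in; treat the labels as unverified.
PANOSE_DIGIT_LABELS = {
    2: ["Family Kind", "Serif Style", "Weight", "Proportion", "Contrast",
        "Stroke Variation", "Arm Style", "Letterform", "Midline", "X-height"],
    3: ["Family Kind", "Tool Kind", "Weight", "Spacing", "Aspect Ratio",
        "Contrast", "Topology", "Form", "Finials", "X-ascent"],
}


def describe_panose(digits):
    """Pair each of the ten PANOSE digits with a label chosen by family kind."""
    if len(digits) != 10:
        raise ValueError("PANOSE needs exactly ten digits")
    # Unknown family kinds get neutral positional labels instead of a guess.
    generic = ["digit %d" % (i + 1) for i in range(10)]
    labels = PANOSE_DIGIT_LABELS.get(digits[0], generic)
    return list(zip(labels, digits))


for label, value in describe_panose([3, 2, 8, 0, 0, 0, 0, 0, 0, 0]):
    print("%s = %d" % (label, value))
```

With family kind 3 the second digit prints as `Tool Kind = 2`, while the TTX tag for the same byte would still read `bSerifStyle`.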

Arno-Enslin commented 8 years ago

Maybe the PANOSE section could be commented in a pragmatic way. I have twice had trouble with fonts in which the PANOSE values were set incorrectly. But it seems that not all values have to be set absolutely correctly. So the comment could primarily provide recommendations that help you avoid trouble with the font with regard to installation, font menus, style links and so on: comments that help you fix bugs that can have grave practical consequences in applications.

HinTak commented 8 years ago

It is really two issues, as Behdad hinted at: besides unpacking, one needs to implement the re-packing routine at the same time, otherwise you end up with XML output that you cannot edit and re-process. And it will be messy, and messier still if you want to do the same unpacking/repacking for other binary strings.
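
A minimal sketch of such a pair of routines, assuming a 16-bit field and using the `fsType` bit assignments from the OS/2 table as the example. The function names and the `bitN` fallback spelling are made up for illustration. The fallback is there because bits without a name must survive the round trip too, which is part of the messiness.

```python
# OS/2.fsType bits that have a defined meaning; labels are illustrative.
FS_TYPE_BITS = {
    1: "RESTRICTED_LICENSE",
    2: "PREVIEW_AND_PRINT",
    3: "EDITABLE",
    8: "NO_SUBSETTING",
    9: "BITMAP_ONLY",
}


def format_binary(value, bits):
    """Format an integer as space-separated groups of 8 binary digits."""
    digits = format(value, "0%db" % bits)
    return " ".join(digits[i:i + 8] for i in range(0, bits, 8))


def parse_binary(text):
    """Parse a binary string, ignoring any whitespace between groups."""
    return int("".join(text.split()), 2)


def unpack_flags(value, bit_names, bits=16):
    """List the set bits by name; unnamed bits are spelled 'bitN'."""
    return [bit_names.get(bit, "bit%d" % bit)
            for bit in range(bits) if (value >> bit) & 1]


def pack_flags(names, bit_names):
    """Inverse of unpack_flags: rebuild the integer from the names."""
    positions = {name: bit for bit, name in bit_names.items()}
    value = 0
    for name in names:
        # Anything that is not a known name must be in the 'bitN' form.
        bit = positions[name] if name in positions else int(name[3:])
        value |= 1 << bit
    return value


value = parse_binary("00000001 00000100")
names = unpack_flags(value, FS_TYPE_BITS)
print(names)  # ['PREVIEW_AND_PRINT', 'NO_SUBSETTING']
print(format_binary(pack_flags(names, FS_TYPE_BITS), 16))  # 00000001 00000100
```

Each additional field (`fsSelection`, `macStyle`, the code page ranges) would need its own name table and bit width, so the amount of per-field data grows quickly even if the two routines stay generic.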