googlefonts / compreffor

A CFF table subroutinizer for FontTools
Apache License 2.0
23 stars 22 forks source link

Consider SEQUITUR(Nevill-Manning) or Re-pair compression algorithm #22

Open be5invis opened 8 years ago

be5invis commented 8 years ago

cf. http://www.sequitur.info/ https://arxiv.org/pdf/cs/9709102v1.pdf

be5invis commented 8 years ago

The SEQUITUR is an algorithm that forms a CF Grammar from an input string in linear time. I think it can be used in CFF subroutinization with some slight modifications:

  1. SEQUITUR is used to compress one string, but CFF CharStrings are multiple characters. Solution: in the two-component searching part, when any one in them is ENDCHAR, than add it directly to rule S.
  2. The recursion depth and subroutine quantity limit. May be work-arounded by expanding some subroutine calls into their expanded form.