pdf-raku / PDF-Class-raku

PDF Document Object Model (under construction)
Artistic License 2.0

core fonts and single byte encodings - not a comfortable fit #1

Closed by dwarring 7 years ago

dwarring commented 9 years ago

The plan is to encode core fonts using single-byte encodings. To this end, Font::AFM calculates string widths for the 0..255 (Latin-1) subset only.

The core fonts use an ExtendedRoman character set, which extends well beyond that range. Consider:

% perl6 -MFont::Metrics::helvetica-bold -M Font::AFM -e 'my $hb = Font::Metrics::helvetica-bold.new; say $hb.Wx.keys (-) @Font::AFM::ISOLatin1Encoding'
set(Kcommaaccent, OE, guilsinglright, Umacron, radical, Sacute, oe, ecaron, dcaron, ohungarumlaut, perthousand, amacron, dagger, Nacute, Tcaron, notequal, lslash, quotesinglbase, Dcroat, lacute, fi, cacute, Ecaron, ncommaaccent, Zacute, umacron, ccaron, Aogonek, ncaron, zacute, nacute, summation, Amacron, Ncommaaccent, Ccaron, florin, lozenge, abreve, emdash, sacute, commaaccent, Emacron, scaron, endash, partialdiff, Ohungarumlaut, ellipsis, Rcaron, quotedblleft, zdotaccent, zcaron, Rcommaaccent, Lacute, scedilla, Ydieresis, fl, quotedblbase, Scaron, uring, edotaccent, omacron, tcaron, Zcaron, Omacron, Edotaccent, Racute, Euro, uhungarumlaut, racute, guilsinglleft, aogonek, lcaron, lessequal, greaterequal, tcommaaccent, rcommaaccent, gbreve, dcroat, Uogonek, Uhungarumlaut, Tcommaaccent, Zdotaccent, Lcommaaccent, Uring, trademark, Delta, Scedilla, emacron, imacron, Dcaron, Lcaron, Scommaaccent, Imacron, Lslash, Gcommaaccent, Cacute, Gbreve, quotesingle, gcommaaccent, Idotaccent, fraction, bullet, Ncaron, Eogonek, eogonek, quotedblright, lcommaaccent, iogonek, rcaron, kcommaaccent, scommaaccent, Iogonek, daggerdbl, Abreve, uogonek)

That's an extra 115 glyphs that fall outside the Latin-1 subset.

This will need a solution in the long term. I'm not sure there's a nicer one than moving to Identity-H.
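For context, here is a minimal sketch of what moving to Identity-H implies for the content stream: every character code becomes a two-byte, big-endian value equal to the CID. It is written in Python purely for illustration and is not this project's code. The function name and the glyph IDs are made up, and it assumes the CIDs are simply the font's glyph IDs (an identity CID-to-GID mapping).

```python
import struct


def encode_identity_h(glyph_ids):
    """Pack each glyph ID as a 2-byte big-endian character code.

    Hypothetical helper: assumes CID == glyph ID, i.e. an identity mapping.
    """
    return b"".join(struct.pack(">H", gid) for gid in glyph_ids)


# Made-up glyph IDs for illustration; real IDs would come from the font.
glyph_ids = [36, 258, 1025]
encoded = encode_identity_h(glyph_ids)

# Hex form, as it might appear in a PDF hex string such as <002401020401>.
hex_string = encoded.hex().upper()
```

The cost is two bytes per glyph and a move away from the simple single-byte font dictionaries, which is why it is a trade-off rather than an obvious win.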

dwarring commented 9 years ago

Maybe some sort of adaptive solution? I've already started adding support for Mac encoding etc. (7363379). There are 32+ unused byte values that could be remapped adaptively.
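A rough sketch of the adaptive idea, in Python for illustration only (not the project's implementation). The class name, the toy base table, and the choice of 0x80..0x9F as the free range are all assumptions. Glyphs missing from the base encoding are given the next free byte code the first time they are used, and the remapped slots are recorded so they could later be written out, for example as a /Differences array in the font's encoding dictionary.

```python
class AdaptiveEncoder:
    """Hypothetical sketch: hand out unused byte codes to glyphs on demand."""

    def __init__(self, base, free_codes):
        self.codes = dict(base)         # glyph name -> byte code
        self.free = sorted(free_codes)  # byte codes with no glyph in the base encoding
        self.differences = {}           # byte code -> glyph name, remapped slots only

    def encode(self, glyph):
        code = self.codes.get(glyph)
        if code is None:
            if not self.free:
                raise ValueError(f"no free byte code left for glyph {glyph!r}")
            # First use of an out-of-range glyph: claim the lowest free code.
            code = self.free.pop(0)
            self.codes[glyph] = code
            self.differences[code] = glyph
        return code


# Toy base table; a real one would cover the whole single-byte encoding.
base = {"A": 0x41, "B": 0x42, "eacute": 0xE9}

# Assumption: 0x80..0x9F (32 codes) are unused in the base encoding.
encoder = AdaptiveEncoder(base, range(0x80, 0xA0))

glyphs = ["A", "Euro", "B", "fi", "Euro"]
encoded = bytes(encoder.encode(g) for g in glyphs)
```

Here "Euro" and "fi" take codes 0x80 and 0x81, and the second "Euro" reuses 0x80. The obvious limit is that a document using more extra glyphs than there are free codes would need a second font resource or a different encoding.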

dwarring commented 7 years ago

The bigger issue is that we need to support Unicode fonts.