galkahana / PDF-Writer

High performance library for creating, modifying and parsing PDF files in C++
http://www.pdfhummus.com
Apache License 2.0

FreeTypeType1Wrapper::GetGlyphForUnicodeChar() does not actually get a glyph for a Unicode character #63

Open TheGS opened 8 years ago

TheGS commented 8 years ago

We convert to strings of UCS2 codepoints and call FreeTypeFaceWrapper::GetGlyphsForUnicodeText() regardless of the font. The implementation of FreeTypeType1Wrapper::GetGlyphForUnicodeChar() wasn't actually getting the desired glyph for Type 1 fonts, nor was it reporting when the font did not have a glyph for the given character.

Our solution was to convert the UCS2 codepoint to a PostScript glyph name, check that the Type 1 font provided a charstring for that glyph name, and then look up the glyph number for the glyph name in the font's private encoding. This let us use the same input to display text as we would for any TrueType or OpenType font, and also let us determine whether we needed to switch to a different font (if the original font didn't provide the needed glyph).

Sorry, I don't have code to share at the moment. We created our map of UCS2 to PostScript glyph names using data from here: https://github.com/adobe-type-tools/agl-aglfn/
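For readers who want something concrete, the three steps described above could be sketched roughly as follows. This is not the code TheGS used (none was shared) and not PDF-Writer's API: `Type1FontInfo`, `GlyphNameTable()`, `MakeSampleFont()` and `GlyphForUnicodeChar()` are hypothetical names, the font is reduced to a set of charstring names plus an encoding vector, and the glyph name table is a three-entry excerpt standing in for data built from the agl-aglfn files.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified view of a parsed Type 1 font (not a PDF-Writer type).
struct Type1FontInfo {
    std::set<std::string> charStrings;   // glyph names that have a charstring
    std::vector<std::string> encoding;   // private encoding: code -> glyph name
};

// Tiny excerpt of a UCS2 -> PostScript glyph name table. A real table would be
// generated from the Adobe Glyph List data in adobe-type-tools/agl-aglfn.
const std::map<std::uint16_t, std::string>& GlyphNameTable() {
    static const std::map<std::uint16_t, std::string> table = {
        {0x0041, "A"}, {0x00E9, "eacute"}, {0x20AC, "Euro"},
    };
    return table;
}

// Sample font for illustration: "eacute" sits in an arbitrary private slot
// (142), and there is no "Euro" charstring at all.
Type1FontInfo MakeSampleFont() {
    Type1FontInfo font;
    font.charStrings = {".notdef", "A", "eacute"};
    font.encoding.assign(256, ".notdef");
    font.encoding[65] = "A";
    font.encoding[142] = "eacute";
    assert(font.encoding.size() <= 256);  // older Type 1 encodings hold at most 256 entries
    return font;
}

// Returns the glyph number for a UCS2 codepoint, or std::nullopt when the font
// cannot render it, so the caller can fall back to another font.
std::optional<unsigned> GlyphForUnicodeChar(const Type1FontInfo& font,
                                            std::uint16_t codepoint) {
    // Step 1: UCS2 codepoint -> PostScript glyph name.
    const auto nameIt = GlyphNameTable().find(codepoint);
    if (nameIt == GlyphNameTable().end()) return std::nullopt;
    const std::string& glyphName = nameIt->second;

    // Step 2: the font must actually provide a charstring for that name.
    if (font.charStrings.count(glyphName) == 0) return std::nullopt;

    // Step 3: find the glyph name in the font's private encoding.
    for (std::size_t code = 0; code < font.encoding.size(); ++code) {
        if (font.encoding[code] == glyphName) return static_cast<unsigned>(code);
    }
    return std::nullopt;
}
```

With the sample font, U+00E9 resolves to slot 142 rather than 0xE9, and U+20AC yields no glyph, which is the signal a caller would use to try a fallback font. The linear scan in step 3 keeps the sketch short; a reverse map from glyph name to code would be the obvious choice for repeated lookups.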

galkahana commented 8 years ago

I don't understand. Was there a problem with UTF-8? This is what hummus is expecting.

TheGS commented 8 years ago

Code in hummus translates UTF-8 to lists of UCS2 codepoints anyway (in several places in PDFUsedFont, etc.) before calling FreeTypeFaceWrapper::GetGlyphsForUnicodeText() itself. We wanted some extra control over the output to do our own vertical glyph substitutions, our own font fallback, etc. before calling Tj() with a glyph mapping list. Calling Tj() with a UTF-8 string calls PDFUsedFont::TranslateStringToGlyphs(), which calls UnicodeString::FromUTF8() and then FreeTypeFaceWrapper::GetGlyphsForUnicodeText().

The point of this issue is that the glyph given for a Unicode character by FreeTypeType1Wrapper::GetGlyphForUnicodeChar() is simply the character code itself if the font has a private encoding. The problem is that the private encoding is not necessarily Unicode and, for older fonts, may be limited to 256 entries.
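To make the failure mode concrete, here is a minimal sketch of what "the glyph is simply the character code" means for a font with a private encoding. It only illustrates the behavior described in this comment and is not PDF-Writer's code; `MakePrivateEncoding()` and `GlyphNameByCharCode()` are hypothetical helpers, and the slot assignments are made up for the example.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// A made-up 256-entry private encoding in which slot 0xE9 holds "Oslash"
// and "eacute" lives at 0x8E, i.e. the layout does not follow Unicode.
std::vector<std::string> MakePrivateEncoding() {
    std::vector<std::string> encoding(256, ".notdef");
    encoding[0x8E] = "eacute";
    encoding[0xE9] = "Oslash";
    return encoding;
}

// Name of the glyph that gets selected when the Unicode codepoint is used
// directly as the character code. Returns an empty string when the codepoint
// falls outside the encoding.
std::string GlyphNameByCharCode(const std::vector<std::string>& encoding,
                                std::uint16_t codepoint) {
    if (codepoint >= encoding.size()) return "";
    return encoding[codepoint];
}
```

In this made-up font, asking for U+00E9 (é) selects "Oslash", and U+20AC (€) cannot be addressed at all because it lies beyond the 256 entries. In both cases the caller gets no indication that the requested character is missing.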

galkahana commented 8 years ago

Yeah. Sounds like what I did when I wrote a PS engine back when. To tell you the truth, I got bored doing it again for PDF-Writer, so I relied on FreeType to do the job, and provided the direct glyph access for whoever is not happy. I think that's as far as I go. Let's leave this open so that others can see the solution that you used. Thank you.