Closed eric-hemasystems closed 2 years ago
Yes, font.on_missing_glyph is exactly for such a situation.
What you can do to have access to the document instance is setting the option after creating the document:
doc = HexaPDF::Document.new
doc.config['font.on_missing_glyph'] = ... # some proc with access to doc
The Glyph class is private, yes. But you can get instances via the #glyph method if you know the correct ID (or name in case of Type1 fonts). Since you need knowledge of the internals, the API docs say that this method should probably not be used. Better use the #decode_utf8(string) method to get an appropriate glyph object.
You could do it like this:
font = doc.fonts.add(...)
glyph = font.decode_utf8("?").first
doc.config['font.on_missing_glyph'] = proc { glyph }
As for the issue with modifying the DefaultDocumentConfiguration, you are right, the document instance would be needed there to facilitate the setting of the behaviour. Currently the third argument passed to the block is the wrapped font, i.e. nothing directly associated with a document. However, as per your explanation it would make more sense to actually use the wrapper font. Then you can easily use #decode_utf8 if needed or access the document via wrapper_font.pdf_object.document. I will change the behaviour for the next release.
Let me know if that helps!
I attempted this. Here is how I'm creating and configuring my document:
HexaPDF::Document.new(io: background_io).tap do |doc|
  font = doc.fonts.add 'Source Sans Pro' # Embed font into document
  # Replace any unknown glyphs (such as tab) with a question mark
  unknown_glyph = font.decode_utf8('?').first
  doc.config['font.on_missing_glyph'] = proc { unknown_glyph }
end
background_io is a StringIO object of an existing PDF (downloaded from ActiveStorage). With this in place I still get the missing glyph error. Furthermore, I replaced the proc with just a debugger statement:
doc.config['font.on_missing_glyph'] = proc { debugger }
When I ran my test it never stopped in the callback. If I do the same thing globally:
HexaPDF::DefaultDocumentConfiguration['font.on_missing_glyph'] = proc { debugger }
it does stop at my breakpoint but of course now I don't have the document reference. I also tried this which does stop at my breakpoint:
HexaPDF::Document.new(io: background_io, config: { 'font.on_missing_glyph' => proc { debugger } })
But of course this doesn't work because this doesn't give me a chance to create the glyph from the font that is tied to the document.
I dug into this a bit and I think it is due to some refactoring. With that commit the proc was no longer pulled from the config at runtime when the callback was executed, but when the font is initialized.
My work-around is to create the closure before the font is embedded but then after embedding that font update the variable captured by the closure with the desired character.
@document ||= HexaPDF::Document.new(io: background_io).tap do |doc|
  unknown_glyph = nil
  doc.config['font.on_missing_glyph'] = proc { unknown_glyph }
  font = doc.fonts.add 'Source Sans Pro' # Embed font into document
  unknown_glyph = font.decode_utf8('?').first
end
This works for now, but if you are going into that code anyway to make the global config work, perhaps we can ensure that a config set after the font is defined is still used.
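To make the difference between the two lookup strategies concrete, here is a minimal pure-Ruby sketch. EagerFont and LazyFont are hypothetical stand-ins invented for illustration; they are not HexaPDF classes and do not reflect its actual implementation. They only model "read the callback once at initialization" versus "read it from the config on every call", plus the closure work-around described above.

```ruby
# Hypothetical stand-ins for illustration only; these are NOT HexaPDF classes.

# Captures the callback once, when the font object is created.
class EagerFont
  def initialize(config)
    @on_missing = config['on_missing_glyph']
  end

  def glyph_for(char)
    @on_missing.call(char)
  end
end

# Looks the callback up in the config every time it is needed.
class LazyFont
  def initialize(config)
    @config = config
  end

  def glyph_for(char)
    @config['on_missing_glyph'].call(char)
  end
end

config = { 'on_missing_glyph' => proc { |_char| :invalid } }
eager = EagerFont.new(config)
lazy = LazyFont.new(config)

# Changing the option after the fonts exist only affects the lazy lookup.
config['on_missing_glyph'] = proc { |_char| :question_mark }
puts eager.glyph_for("\t").inspect
puts lazy.glyph_for("\t").inspect

# Work-around for the eager case: install the proc first and let it close
# over a variable that is assigned after the font has been created.
fallback = nil
late_config = { 'on_missing_glyph' => proc { |_char| fallback } }
late_font = EagerFont.new(late_config)
fallback = :question_mark
puts late_font.glyph_for("\t").inspect
```

The work-around relies on Ruby closures capturing variables rather than values, so the proc sees whatever fallback holds at the time it is called.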
I dug into this a bit and I think it is due to some refactoring. With that commit the proc was no longer pulled from the config at runtime when the callback was executed, but when the font is initialized.
You are right, this is indeed the problem. I will fix this in a bug fix release in the next few days. Thanks for the debugging! :pray:
@eric-hemasystems I have changed the block signature for the font.on_missing_glyph configuration option and fixed the caching bug. Will release later today or tomorrow.
@eric-hemasystems New version 0.21.0 is out with the fixes.
I am occasionally getting a "Glyph for is missing" error. Your documentation where it says:
helped me determine that if I want a fuller range of character support I need to embed my own TrueType font. This seems to have resolved a lot of the issue, but user data still occasionally has characters that don't have a glyph in my font, such as a tab character.
Right now I am handling this by just squashing the error, and none of the content is placed on the PDF. Ideally I would like to keep any user content that does have a glyph. Either replace the invalid chars with an empty string OR possibly a ? to indicate an unsupported character was supposed to go here.
The font.on_missing_glyph option seems perfect for that. Instead of returning a HexaPDF::Font::InvalidGlyph object (which triggers the exception) I could have it return my glyph of choice. The problem is I don't know what object that is. Based on the code it seems like it should be a HexaPDF::Font::TrueTypeWrapper::Glyph object. But that is a private class.
Even if I use const_get to bypass the protection, I'm not sure what I pass for the constructor arguments. I tried some things like :question (based on this glyphlist), or ?, or even 63 (ASCII code for question mark), all to no avail.
The HexaPDF::Font::TrueTypeWrapper#glyph method looks promising, but the documentation indicates it should not be used by an application even though it is public. Plus, to create an instance of HexaPDF::Font::TrueTypeWrapper I need a HexaPDF::Document, which I don't have when configuring font.on_missing_glyph on the DefaultDocumentConfiguration.
I must be missing something, as I cannot seem to figure out what font.on_missing_glyph should return if we don't want the default behavior of returning a HexaPDF::Font::InvalidGlyph object.
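For reference, the desired outcome (keep the supported characters, replace or drop the rest) can be sketched in plain Ruby as a pre-processing step on the text. This is a different technique from the font.on_missing_glyph hook, and both the SUPPORTED set and the replace_unsupported helper are hypothetical names made up for this sketch; a real check would have to ask the embedded font which characters it covers.

```ruby
require 'set'

# Hypothetical stand-in for "the font has a glyph for this character".
# Assumption for this sketch: only printable ASCII is supported.
SUPPORTED = (0x20..0x7E).map { |cp| cp.chr(Encoding::UTF_8) }.to_set

# Replaces every character outside SUPPORTED with the replacement string.
# Pass replacement: '' to drop unsupported characters instead.
def replace_unsupported(text, replacement: '?')
  text.each_char.map { |ch| SUPPORTED.include?(ch) ? ch : replacement }.join
end

puts replace_unsupported("Name:\tEric")
puts replace_unsupported("Name:\tEric", replacement: '')
```

With the tab outside the supported set, the first call marks its position with a question mark and the second removes it.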