Closed jslabovitz closed 9 months ago
So first: Very cool for using Harfbuzz for shaping/positioning! :+1: The current shaping mechanism in HexaPDF only supports basic ligatures but using a real shaping engine like Harfbuzz has been on my TODO list for a long time. Are you using a Ruby interface for this or something else?
HexaPDF only supports TrueType fonts at the moment, so those OpenType fonts that contain a `glyf` table. Since I found that many fonts are available in both formats, i.e. one version with a `glyf` table and another one with a `CFF` table, there was no urgency in implementing the latter. `CFF` tables are also a bit more complicated to parse and integrate. It might be possible to make use of ttfunk from the Prawn project, which recently gained support for `CFF` type fonts, but I haven't looked into this.
Note that `.otf` files may contain either `CFF` tables or `glyf` tables but usually contain `CFF` tables.
So: You can't use `CFF` type OpenType fonts with HexaPDF at the moment. There are plans to implement support but this is a long term goal.
I'm quite surprised that using a `CFF` type font without subsetting works since the PDF objects created for the font tell the PDF viewer that it is a TrueType font. Therefore this may be coincidence and not work across all viewers.
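To illustrate why a `CFF` font declared as TrueType is technically a mismatch, here is a rough sketch of the entries the PDF specification associates with each outline format when a font is embedded as a composite font. The constant and method names are made up for this illustration and are not part of HexaPDF's API.

```ruby
# Hypothetical lookup table (not HexaPDF API): which descendant font subtype
# and which font descriptor stream key go with each outline format, according
# to the PDF specification.
PDF_CIDFONT_ENTRIES = {
  # TrueType outlines (glyf table): CIDFontType2, font program in /FontFile2.
  truetype: { cidfont_subtype: :CIDFontType2, font_file_key: :FontFile2,
              font_file_subtypes: [] },
  # CFF outlines: CIDFontType0, font program in /FontFile3, whose stream
  # carries a /Subtype of /CIDFontType0C (bare CFF) or /OpenType (whole file).
  cff: { cidfont_subtype: :CIDFontType0, font_file_key: :FontFile3,
         font_file_subtypes: [:CIDFontType0C, :OpenType] }
}.freeze

# Returns true if the declared entries are consistent with the outline format.
def consistent_font_entries?(outline_format, cidfont_subtype, font_file_key)
  expected = PDF_CIDFONT_ENTRIES.fetch(outline_format)
  expected[:cidfont_subtype] == cidfont_subtype &&
    expected[:font_file_key] == font_file_key
end
```

A `CFF` font written out with the TrueType entries fails this check, which is why a viewer that displays it anyway is being lenient.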
Are you sure that those experiments use an OTF font with a `CFF` table and not a `glyf` table?
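One way to check this without any font library is to read the table directory at the start of the font file. The sketch below uses only the Ruby standard library; the method names `sfnt_table_tags` and `outline_format` are invented for this example, and font collections (`.ttc`) are not handled.

```ruby
require 'stringio'

# Reads the sfnt table directory from an IO positioned at the start of a
# single-font TrueType/OpenType file and returns the list of table tags.
def sfnt_table_tags(io)
  header = io.read(12) # sfntVersion (4 bytes), numTables, 3 search fields
  unless header && header.bytesize == 12
    raise ArgumentError, 'file too short for an sfnt header'
  end
  _version, num_tables = header.unpack('a4n')
  Array.new(num_tables) do
    record = io.read(16) # tag, checksum, offset, length (4 bytes each)
    unless record && record.bytesize == 16
      raise ArgumentError, 'truncated table directory'
    end
    record[0, 4]
  end
end

# Classifies the outline format based on which outline table is present.
def outline_format(tags)
  if tags.include?('glyf')
    :truetype
  elsif tags.include?('CFF ') || tags.include?('CFF2')
    :cff
  else
    :unknown
  end
end

# Usage: ruby check_font.rb /path/to/font.otf
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  File.open(ARGV[0], 'rb') { |f| puts outline_format(sfnt_table_tags(f)) }
end
```

Note that the `CFF` tag is padded with a trailing space in the file, since all table tags are exactly four bytes long.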
Hmm. I'm using macOS's Preview app to view the generated PDFs. Perhaps there is some magic fallback logic in macOS that allows these PDFs to work, even if they aren't entirely valid?
Unfortunately about 95% of my OpenType fonts are CFF format, not TT (glyf) format. And I've been maintaining this collection for ~20 years, so I'm not likely at this point to replace my ancient CFF fonts with TT versions. :-)
To experiment, I found an OTF font with a glyf table (https://www.fontsquirrel.com/fonts/Aller), and happily HexaPDF handles it just fine, with no crash on subsetting.
I might be up for helping to parse/subset the CFF data for embedding & subsetting. In the past, I wrote a large part of an OpenType parser/shaper (in Ruby), though abandoned it once I found Harfbuzz.
Speaking of Harfbuzz (and apologies for leading this issue into digressions): Several years ago, I wrote a basic gem to interact with the Harfbuzz library, using FFI. It's on Rubygems/github, and in fact I just updated it to fix a few things: http://github.com/jslabovitz/harfbuzz-gem
I spent some time in the last few days writing a small script that first writes basic text using the HexaPDF API, then using Harfbuzz. Unfortunately it's not exactly compatible with your current `TextShaper` class, as Harfbuzz needs pre-decoded UTF8 strings to do its shaping. But it should give you the basic idea of how shaping using Harfbuzz might work.
```ruby
require 'harfbuzz'
require 'hexapdf'

# Replaces the items of the given fragment with glyphs shaped by Harfbuzz.
def harfbuzz_shape(text, fragment, font_file:)
  font = fragment.style.font
  font_size = fragment.style.font_size
  # FIXME: use +/- for true/false
  features = fragment.style.font_features.select { |_k, v| v }.keys.map(&:to_s)
  # FIXME: try to get font_file from the wrapped_font object, and cache the face for efficiency
  hb_face = Harfbuzz::Face.new(File.open(font_file, 'rb'))
  hb_font = Harfbuzz::Font.new(hb_face, font_size)

  buffer = Harfbuzz::Buffer.new
  buffer.add_utf8(text)
  buffer.guess_segment_properties
  Harfbuzz.shape(hb_font, buffer, features)
  buffer.normalize_glyphs

  glyph_infos = buffer.get_glyph_infos
  glyph_positions = buffer.get_glyph_positions
  fragment.items = []
  glyph_infos.each_with_index do |info, i|
    position = glyph_positions[i]
    # After shaping, info.codepoint holds a glyph ID, not a Unicode codepoint.
    # The sample text is Latin, so ask for the left-to-right advance.
    advance = hb_font.glyph_advance_for_direction(info.codepoint, Harfbuzz::HB_DIRECTION_LTR)
    kern = advance - position.x_advance
    fragment.items << font.glyph(info.codepoint)
    fragment.items << kern unless kern == 0
  end
end

# https://www.fontsquirrel.com/fonts/Aller
font_file = '/Users/johnl/Fonts/A/Aller/Aller_Rg.ttf'
size = 100
text = 'WAVE first!'

doc = HexaPDF::Document.new
wrapped_font = doc.fonts.add(font_file)
style = HexaPDF::Layout::Style.new(
  font: wrapped_font,
  font_size: size,
  font_features: { kern: true, liga: true }
)
canvas = doc.pages.add([0, 0, 1000, 1000]).canvas

# standard text
fragment = HexaPDF::Layout::TextFragment.create(text, style)
# pp fragment.items
fragment.draw(canvas, 0, size * 2)

# Harfbuzz text
fragment = HexaPDF::Layout::TextFragment.create(text, style)
harfbuzz_shape(text, fragment, font_file: font_file)
# pp fragment.items
fragment.draw(canvas, 0, size * 1)

doc.write('/tmp/out.pdf')
```
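Regarding the first FIXME in the script: HarfBuzz's feature string syntax accepts a leading `+` or `-` to turn a feature on or off, so the features hash could be mapped as in the sketch below instead of dropping the disabled entries. The helper name `harfbuzz_feature_strings` is made up here, and whether the gem's `Harfbuzz.shape` passes these strings through unchanged is an assumption I have not verified.

```ruby
# Hypothetical helper: converts a features hash such as
# { kern: true, liga: false } into HarfBuzz-style feature strings,
# using "+" for enabled and "-" for disabled features.
def harfbuzz_feature_strings(font_features)
  font_features.map { |name, enabled| "#{enabled ? '+' : '-'}#{name}" }
end

harfbuzz_feature_strings({ kern: true, liga: false })
# => ["+kern", "-liga"]
```

This way an explicitly disabled feature such as `liga: false` reaches the shaper as `-liga` rather than being silently left at the font's default.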
Regarding OpenType CFF support: The https://github.com/prawnpdf/ttfunk/ gem which is used by Prawn recently got support for OpenType fonts. So adding CFF support to HexaPDF via ttfunk might not be that hard but I haven't looked into it. All the code concerning CFF seems to be in https://github.com/prawnpdf/ttfunk/tree/master/lib/ttfunk/table/cff. The integration into Prawn is at https://github.com/prawnpdf/prawn/blob/master/lib/prawn/fonts/ttf.rb.
Note, however, that Prawn does font embedding a bit differently because they don't use composite fonts but simple PDF fonts. Hmm... and I just saw this: https://github.com/prawnpdf/prawn/blob/master/lib/prawn/fonts/ttf.rb#L367 where they hardcode the type to TrueType... So I'm not sure that Prawn itself already supports using OTF with CFF tables...
Thanks for info about Harfbuzz and the code example! Will put that on my TODO list to have a deeper look later.
@jslabovitz I'm closing this issue since the original question about OpenType font support has been answered.
Adding OpenType font support as well as providing integration with harfbuzz for better (and more correct) glyph positioning is on my ToDo list - thanks again for the pointer!
I'm writing a typography-oriented application that depends on OpenType fonts. I'm actually using Harfbuzz to shape/position the text into glyphs, then drawing the glyphs using `#show_glyphs`. My initial experiments are a success, as long as I turn off subsetting. If I don't turn off subsetting, HexaPDF crashes while accessing the `glyf` table, which as I understand may not exist for all (any?) OpenType fonts. My minimal code & resulting crash:
Is it indeed true that I won't be able to use OpenType fonts in HexaPDF, and still be able to subset them? If so, do you have plans to add this?