sciurius / perl-HarfBuzz-Shaper

Perl extension to use HarfBuzz for text shaping

Difference in sizes #4

Closed PhilterPaper closed 4 years ago

PhilterPaper commented 4 years ago

When you were playing with HarfBuzz and designing Shaper, did you encounter differences in reported dimensions? I am using, say, Times at 20 pt, and the advance width reported by wxByCId() is often about 0.02 pt larger than your 'ax' width. That corresponds to a 'kern' of 1 for the TJ operator. Is this an internal rounding error in HarfBuzz, or in your wrapper? I'm trying to account for the slight difference, which bloats the PDF file quite a bit (many tiny kern values).

I'm assuming that a Reader will be using the widths from the font file, one after the other (hex codes in <>). Therefore the line of text will be a little longer than HarfBuzz::Shaper claims unless I put the small kern values (with the TJ operator) on many glyphs, and allow for this when calculating the advance width. Finally, Shaper does real kerning, which I have to be careful not to stomp on. That's why it would have been great to have the true advance width and a separate kern amount (I mentioned this earlier).
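
For reference, the arithmetic behind "0.02 pt corresponds to a kern of 1" can be sketched as follows. This is a minimal illustration, assuming the usual PDF convention that numbers in a TJ array are expressed in thousandths of a unit of text space and scaled by the font size. The function names are hypothetical and are not part of HarfBuzz::Shaper or PDF::Builder.

```python
def tj_units_to_points(tj_units, font_size):
    """Convert a TJ array number to a displacement in points.

    Assumes TJ numbers are thousandths of a text space unit,
    so the displacement scales with the font size.
    """
    return tj_units / 1000.0 * font_size


def points_to_tj_units(delta_pt, font_size):
    """Convert a width difference in points to (unrounded) TJ units."""
    return delta_pt * 1000.0 / font_size


# At 20 pt, one TJ unit is 0.02 pt, which matches the observed mismatch.
one_unit = tj_units_to_points(1, 20)

# If every glyph on a 60-glyph line were off by one unit,
# the accumulated difference would be about 1.2 pt.
drift = 60 * one_unit
```

A difference of that size is small per glyph but adds up over a line, which fits the observation that an underline computed from one set of widths can run slightly past text set with the other.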

sciurius commented 4 years ago

I just pass through the values I get from harfbuzz. Do you have a small test case that shows the discrepancy?

PhilterPaper commented 4 years ago

If HarfBuzz is handling the values as numeric (floating point?) internally, and returning them as numbers, maybe the XS stuff is rounding or truncating (e.g., newSViv() call) when getting it back to Perl format?

HarfBuzz does reduce ax in order to do kerning (place the next glyph closer than it naturally would be). My assumption is that a Reader would be using the font file's widths (and that wxByCId() is that value), and over the course of a line a slight but noticeable error could build up. I noticed this when implementing underline and strikethrough functions -- the line runs a little too long and I'm trying to figure out why. It's not a huge error, but I'd like to know what's going on.

I will send HarfBuzz.pl (example) and the latest Content.pm directly to you. It will output the ax and wx (cw) values. I think I'm going to rewrite some of the code that deals with dx/dy placement when dy is not 0.

PhilterPaper commented 4 years ago

It turns out that if I disable kerning (-kern or $dokern=0;), the mismatch between ax and wxByCId() seems to go away. Apparently HarfBuzz is just doing microkerning to an obsessive level. I'll have to think about ignoring kern requests (ax<cw) when they are very tiny, to reduce PDF size and rendering time.
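
One way the "ignore very tiny kerns" idea could look is sketched below. It is only an illustration under stated assumptions: the input layout (a list of hex glyph code, ax, cw tuples), the function name build_tj, and the min_kern threshold are all hypothetical, and this is not the code in Content.pm. The sketch also returns the width the Reader will actually advance, since a skipped kern means the glyph is set at its font-file width (cw) rather than at ax.

```python
def build_tj(glyphs, font_size, min_kern=2):
    """Build a TJ operand string, dropping kerns smaller than min_kern.

    glyphs: list of (hex_cid, ax_pt, cw_pt) tuples (hypothetical layout),
            where ax is the shaper's advance and cw the font-file width.
    Returns (tj_string, actual_width_pt).
    """
    parts = []   # alternating "<hex run>" strings and kern numbers
    run = ""     # glyph hex codes collected since the last kern
    width = 0.0  # width the Reader will really advance
    for hex_cid, ax, cw in glyphs:
        run += hex_cid
        # Positive TJ numbers pull the next glyph closer (ax < cw).
        kern = round((cw - ax) * 1000.0 / font_size)
        if abs(kern) >= min_kern:
            parts.append("<%s>" % run)
            parts.append(str(kern))
            run = ""
            width += cw - kern * font_size / 1000.0
        else:
            # Kern too small to matter: glyph keeps its natural width.
            width += cw
    if run:
        parts.append("<%s>" % run)
    return "[" + " ".join(parts) + "] TJ", width


sample = [("0001", 9.98, 10.0), ("0002", 9.5, 10.0), ("0003", 10.0, 10.0)]
tj, total = build_tj(sample, 20)
```

With the sample data, the one-unit kern after the first glyph is dropped and the first two glyphs share one hex run, while the 25-unit kern is kept. Using the returned width (instead of the sum of ax values) for underline and strikethrough would keep those lines in step with the text as actually rendered.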

sciurius commented 4 years ago

I love self-solving issues ☺

BTW, harfbuzz uses integer arithmetic. To get accurate results you need to specify a font size (for the rendering algorithms) and a scaling factor (for precision).
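
The effect of integer arithmetic and a scaling factor can be illustrated with a small model. This is not HarfBuzz's actual code, and the exact rounding HarfBuzz applies is an assumption here. The sketch only shows why a larger integer scale preserves more precision. The function name and the precision parameter are hypothetical.

```python
def scaled_advance(design_units, upem, font_size, precision):
    """Model an integer-only scaled advance, returned in points.

    design_units: glyph advance in font design units
    upem:         units per em of the font
    font_size:    integer font size in points
    precision:    integer multiplier applied to the scale
                  (hypothetical name for the "scaling factor")
    """
    scale = font_size * precision
    # Integer multiply and divide, rounding to the nearest unit.
    raw = (design_units * scale + upem // 2) // upem
    return raw / precision


# A 722-unit advance in a 1000-upem font at 20 pt is 14.44 pt exactly.
coarse = scaled_advance(722, 1000, 20, 1)     # whole points only
fine = scaled_advance(722, 1000, 20, 1000)    # thousandths of a point
```

In this model the coarse result loses the fractional part entirely, while the scaled-up computation keeps it to within about 0.0005 pt, well below the 0.02 pt size of a single TJ unit at 20 pt.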