Open stechio opened 3 years ago
Thanks @stechio for the detailed write-up. The trouble is that at this point we don't know whether the matched font will contain all the characters needed, so simply not adding serif would change behaviour. Consider font-family: 'KoreanCharsOnly' applied to text which then uses Latin characters.
To know whether we are actually using serif we could run the font run divider again, but this is extremely slow because it relies on exceptions from PDFBox to check whether a character is in the given font.
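The coverage problem can be illustrated with a minimal sketch in plain Java, with no PDFBox dependency. Everything in it is a hypothetical stand-in: the Font and Run classes, the divide and fontsUsed methods, and the glyph-coverage predicates are assumptions made for illustration, not the library's actual font run divider. It only shows that whether serif ends up being used depends on the text, not on the font-family list alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.IntPredicate;

public class FontRunSketch {

    /** Hypothetical stand-in for a resolved font: a name plus a glyph-coverage test. */
    static final class Font {
        final String name;
        final IntPredicate hasGlyph;

        Font(String name, IntPredicate hasGlyph) {
            this.name = name;
            this.hasGlyph = hasGlyph;
        }
    }

    /** A stretch of consecutive text assigned to a single font. */
    static final class Run {
        final String fontName;
        final String text;

        Run(String fontName, String text) {
            this.fontName = fontName;
            this.text = text;
        }
    }

    /** Illustrative chain: a font covering only Hangul, then serif as the final fallback. */
    static List<Font> koreanChain() {
        return Arrays.asList(
            new Font("KoreanCharsOnly",
                cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HANGUL),
            new Font("serif", cp -> cp < 0x80)); // assumption: serif covers ASCII only
    }

    /** Assigns each code point to the first font in the chain that covers it. */
    static List<Run> divide(String text, List<Font> chain) {
        List<Run> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String currentFont = null;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            // If no font covers the code point, the last font in the chain is used.
            String chosen = chain.get(chain.size() - 1).name;
            for (Font f : chain) {
                if (f.hasGlyph.test(cp)) {
                    chosen = f.name;
                    break;
                }
            }
            // Close the current run when the chosen font changes.
            if (!chosen.equals(currentFont) && current.length() > 0) {
                runs.add(new Run(currentFont, current.toString()));
                current.setLength(0);
            }
            currentFont = chosen;
            current.appendCodePoint(cp);
        }
        if (current.length() > 0) {
            runs.add(new Run(currentFont, current.toString()));
        }
        return runs;
    }

    /** Names of the fonts that actually received at least one run, in order of first use. */
    static Set<String> fontsUsed(String text, List<Font> chain) {
        Set<String> names = new LinkedHashSet<>();
        for (Run r : divide(text, chain)) {
            names.add(r.fontName);
        }
        return names;
    }

    public static void main(String[] args) {
        // Hangul-only text versus Hangul followed by Latin characters.
        System.out.println(fontsUsed("\uD55C\uAE00", koreanChain()));
        System.out.println(fontsUsed("\uD55C\uAE00 abc", koreanChain()));
    }
}
```

In this sketch the set of fonts used is only known after the text has been divided, which is why dropping serif at resolution time would change behaviour for text the matched font cannot cover.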
I think what I need to do is:
getWidth
along with the width so this can be stored in the line break context and inline text objects and then passed back to the text renderer to actually render text or get the font metrics. I'll see if I can have a go at the first item soon.
I noticed that the font resolution algorithm stubbornly adds the serif built-in font, no matter whether the selected font families have already found an appropriate built-in match (see com.openhtmltopdf.pdfboxout.PdfBoxFontResolver.resolveFont(..)):
That behavior adversely affects the font metrics calculation (see com.openhtmltopdf.pdfboxout.PdfBoxTextRenderer.getFSFontMetrics(..)) and, in turn, the actual text placement (see com.openhtmltopdf.layout.InlineBoxing.calculateInlineMeasurements(..)), as parameters like ascent and descent are calculated from incoherent typefaces. For example, if monospace is selected via CSS, the result is an ugly extra space above the ascender line which unbalances the ascent/descent ratio, making text feel like it's sitting on the line bottom instead of flowing in the middle. Here is a comparison (on the left, the current wrong behavior; on the right, the correct one generated with the code fix below):
Here is how Gecko (Firefox) renders the same input (note the balanced ascent/descent ratio):
Source HTML: serifFallback.html.txt
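To make the metrics effect concrete, here is a rough sketch, assuming the line metrics are taken as the maximum ascent and maximum descent over every font in the resolved list. That combination rule, the FaceMetrics class, the combine method, and all the numbers are assumptions for illustration only; they are not taken from the library or from real font files.

```java
import java.util.Arrays;
import java.util.List;

public class LineMetricsSketch {

    /** Hypothetical per-face metrics, in points at some fixed font size. */
    static final class FaceMetrics {
        final String name;
        final float ascent;
        final float descent;

        FaceMetrics(String name, float ascent, float descent) {
            this.name = name;
            this.ascent = ascent;
            this.descent = descent;
        }
    }

    /** Illustrative resolved list: the requested monospace face plus the appended serif. */
    static List<FaceMetrics> resolvedChain() {
        return Arrays.asList(
            new FaceMetrics("monospace", 7.5f, 2.5f), // made-up values
            new FaceMetrics("serif", 9.0f, 2.25f));   // made-up values
    }

    /** Returns {ascent, descent} as the maxima over the given faces (assumed rule). */
    static float[] combine(List<FaceMetrics> faces) {
        float ascent = 0f;
        float descent = 0f;
        for (FaceMetrics f : faces) {
            ascent = Math.max(ascent, f.ascent);
            descent = Math.max(descent, f.descent);
        }
        return new float[] { ascent, descent };
    }

    public static void main(String[] args) {
        List<FaceMetrics> chain = resolvedChain();
        // Metrics over the whole list, including a serif face that renders nothing.
        float[] all = combine(chain);
        // Metrics over the face that actually renders the text.
        float[] used = combine(chain.subList(0, 1));
        System.out.println("whole list: ascent=" + all[0] + " descent=" + all[1]);
        System.out.println("used only:  ascent=" + used[0] + " descent=" + used[1]);
    }
}
```

Under these assumed numbers the unused serif face raises the ascent while the descent stays the same, which is the kind of imbalance described above for monospace text.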
The nuisance can be easily fixed with a little code change: