Closed. kreier closed this 5 months ago
Here is a reference for the Khmer Unicode script: ISO 15924, Khmer Unicode
Some software needs to define each character in binary and decode it for output, using regex. But I am not good at NLP.
I can understand why this seems like an issue with the font, because you have tried a different font and it works. However, Dan Hong's Khmer font is constructed differently and does not use mark attachment. A problem that occurs when using a font with an open source PDF library, and that nobody has reported in general, is almost always caused by the PDF library.
I believe that the problem here is that FPDF is not correctly accounting for the advance width of mark attached glyphs. I would be looking carefully at the implementation of get_string_width.
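To make the point about advance widths concrete, here is a minimal sketch of how a shaped run's width is usually derived from per-glyph advances. It is not FPDF's code: `ShapedGlyph`, `shaped_width`, the glyph names, and every number are hypothetical, chosen only to show that attached marks typically carry a zero advance and are placed by offsets, so offsets must not be added to the width.

```python
from dataclasses import dataclass


@dataclass
class ShapedGlyph:
    """Hypothetical stand-in for one glyph record from a shaping engine."""
    name: str
    x_advance: int      # how far the pen moves, in font units
    x_offset: int = 0   # where the glyph is drawn relative to the pen


def shaped_width(glyphs, font_size_pt, units_per_em=1000):
    """Sum the advances only; offsets position a glyph but do not move the pen."""
    total_units = sum(g.x_advance for g in glyphs)
    return total_units * font_size_pt / units_per_em


# Made-up values for a base letter, a subjoined consonant, a vowel and a mark.
glyphs = [
    ShapedGlyph("base", 600),
    ShapedGlyph("subjoined", 0, x_offset=-300),
    ShapedGlyph("vowel", 350),
    ShapedGlyph("mark_above", 0, x_offset=-200),
]

width = shaped_width(glyphs, font_size_pt=12)
print(f"shaped width: {width} pt")
```

If an implementation also subtracted the negative offsets, or summed unshaped per-character widths instead, the result would differ from this total, which is the kind of discrepancy worth looking for.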
Closing as this is now fixed in the PDF library.
Defect Report
I use NotoSansKhmer and uharfbuzz together with fpdf2 to create a PDF document. To right-align the text I need the width of a string after the font shaping engine has adjusted it for combined characters. With all Noto fonts (Sans, Serif, SansUI) I get values that are too small for some glyphs, so the right alignment is shifted by several pixels. A demonstration is attached below. Switching to the Google font Khmer-Regular.ttf solves the problem.
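As a rough illustration of why an underestimated width shows up as a shifted right edge, here is a small sketch of the alignment arithmetic. The function name `right_aligned_x` and all measurements are assumptions for illustration, not values taken from the actual document.

```python
def right_aligned_x(right_edge, text_width):
    """Start position that puts the end of the text on the right edge."""
    return right_edge - text_width


RIGHT_EDGE = 200.0      # hypothetical right margin position
true_width = 40.0       # hypothetical width the text really occupies
reported_width = 37.5   # hypothetical, too-small width from the library

x_start = right_aligned_x(RIGHT_EDGE, reported_width)
# The text really ends at x_start + true_width, past the intended edge.
overshoot = (x_start + true_width) - RIGHT_EDGE
print(f"text starts at {x_start} and overshoots the edge by {overshoot}")
```

The overshoot equals the difference between the true and the reported width, so any error in the measured width appears one-to-one as a misalignment.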
Title
Some combination glyphs in Khmer return inconsistent width values
Font
Where the font came from, and when
Site: https://notofonts.github.io/khmer/ Site: https://notofonts.github.io/khmer/fonts/NotoSansKhmer/googlefonts/ttf/NotoSansKhmer-Regular.ttf Date: 2024-06-06
Font Version
2.004
OS name and version
Windows 11 Pro 23H2
Application name and version
fpdf2 2.7.9 with uharfbuzz 0.39.1
Issue
As written in the introduction, the returned width of a string after font shaping with HarfBuzz is too small for several combinations with Noto fonts. The Python program reproduces the result for a variety of fonts: the width is consistently too small with the Noto fonts, while the Google font does not show the problem.
I draw a box with the width of the drawn string as returned by the pdf.string_width() function. See the comparison in the screenshots below. The Python program is:
Character data
One example is "year": ឆ្នាំ or U+1786, U+17D2, U+1793, U+17B6, U+17C6. It is the first string in my example above.
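A quick way to double-check which code points a test string contains is to list them with Python's standard `unicodedata` module. This is only a sketch: the helper name `describe` is hypothetical, and the word is built from escapes on the assumption that "year" is spelled CHA, COENG, NO, AA, NIKAHIT; replace it with your own input if it differs.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode character name) pairs for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# Assumed spelling of the Khmer word for "year", written as escapes so the
# exact code points are visible in the source.
word = "\u1786\u17d2\u1793\u17b6\u17c6"

for code_point, name in describe(word):
    print(code_point, name)
```

Pasting the string from the failing document in place of `word` shows whether the shaper receives the sequence you expect.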
Screenshot
This is the result of all Noto fonts (glyphs are slightly different, of course) but the box is consistently too small:
Problem solved by using another font: https://fonts.google.com/specimen/Khmer
Tools for reporting bugs
HarfBuzz hb-view and hb-shape
These are part of the HarfBuzz distribution and can help isolate if an issue is in the app/OS, shaping engine, or font.
For example:
Fontview
Fontdiff