pymupdf / PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
https://pymupdf.readthedocs.io
GNU Affero General Public License v3.0

insert_text() not display true font correctly #3438

Closed neuvkeo closed 2 months ago

neuvkeo commented 2 months ago

Description of the bug

The script below adds Khmer text on an additional page of a PDF, but the Khmer text is not rendered correctly with the font.

########
import fitz

fileName = r'Be Your Best Executive Summary.pdf'
pdf_path = 'Books/' + fileName
pdf_document = fitz.open(pdf_path)

khmer_font_path = r"./Fonts/KhmerUI.ttf"
pdf_document._insert_font(khmer_font_path,)

new_page = pdf_document.new_page(width=600, height=800)
new_page.insert_font(fontname="CustomFont", fontfile=khmer_font_path)

khmer_text = "សួស្តី ពិភពលោក"

text_x, text_y = 100, 100
new_page.insert_text((text_x, text_y), khmer_text, fontsize=12, fontname="CustomFont")

# Save the changes to the PDF document
pdf_document.save("output.pdf")
pdf_document.close()
#############

Original Text: image

Output: image

How to reproduce the bug

Run the script above, replacing the fileName variable with the name of your own PDF.

PyMuPDF version

1.24.2

Operating system

Windows

Python version

3.12

JorjMcKie commented 2 months ago

Text in this font cannot be rendered correctly when it is written character by character, which is what .insert_text(), .insert_textbox() and .fill_textbox() do. You must use an output method that is capable of text shaping, such as the Story class or, probably most practical in your case, insert_htmlbox.
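A minimal sketch of the insert_htmlbox route, assuming the same ./Fonts/KhmerUI.ttf file and page size as in the script from the report. The helper names (build_font_css, build_html, add_shaped_text_page), the "khmer" font-family name and the rectangle coordinates are illustrative choices, not part of PyMuPDF, and the result has not been checked against this particular font.

```python
import html


def build_font_css(font_file: str, family: str = "khmer") -> str:
    """Return CSS that binds a font file to a family name and applies it to all elements.

    The file name is resolved through the Archive passed to insert_htmlbox.
    """
    return (
        f"@font-face {{font-family: {family}; src: url({font_file});}} "
        f"* {{font-family: {family};}}"
    )


def build_html(text: str) -> str:
    """Wrap plain text in a paragraph, escaping HTML special characters."""
    return f"<p>{html.escape(text)}</p>"


def add_shaped_text_page(pdf_path, out_path, text,
                         font_dir="./Fonts", font_file="KhmerUI.ttf"):
    """Append a page and write `text` via insert_htmlbox, which applies text shaping."""
    import fitz  # PyMuPDF; imported lazily so the helpers above work without it

    doc = fitz.open(pdf_path)
    page = doc.new_page(width=600, height=800)
    rect = fitz.Rect(100, 100, 500, 300)  # assumed box that receives the text
    archive = fitz.Archive(font_dir)      # lets the CSS url() find the font file
    page.insert_htmlbox(rect, build_html(text),
                        css=build_font_css(font_file), archive=archive)
    doc.save(out_path)
    doc.close()
```

Calling add_shaped_text_page("Books/Be Your Best Executive Summary.pdf", "output.pdf", "សួស្តី ពិភពលោក") would mirror the original script. Unlike insert_text, insert_htmlbox takes a rectangle rather than a start point, so the box must be large enough for the text.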

neuvkeo commented 2 months ago

Thanks a lot, Mr. JorjMcKie.

It works fine with insert_htmlbox.

BR, Neuv