mhammond / pywin32

Python for Windows (pywin32) Extensions

UTF-8 / ISO-8859-1 encoding using DrawText #2075

Closed by wilm02 1 year ago

wilm02 commented 1 year ago

Here is a Python program that renders text in a TrueType font using native Windows routines. It works fine with one exception: UTF-8 / ISO-8859-1 encoding. The rendered output is "Ã¼ = u-umla" instead of the expected "ü = u-umlaut". The beginning of the output is typical of UTF-8 text opened with ISO-8859-1 encoding. The end of the output is truncated, apparently because the width is calculated correctly for the intended text, while the garbled text that actually gets drawn is one character longer. This program runs under 64-bit Windows 10 with Python 3.10.5 and pywin32-306.

Doing the same in C, with the string "ü = u-umlaut" written directly in the source, the rendered output depends on the encoding of the source file. With an ISO-8859-1 source file I get the expected output: "ü = u-umlaut". With a UTF-8 source file I get: "Ã¼ = u-umlaut". In Python, however, UTF-8 is the default encoding for source files.
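The garbled beginning of the output can be reproduced without any Windows API. The following is a minimal sketch, assuming the text is encoded as UTF-8 somewhere along the way and the resulting bytes are then interpreted as ISO-8859-1; the helper name `misdecode` is made up for illustration.

```python
def misdecode(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as ISO-8859-1.

    Hypothetical helper: it only imitates the suspected encoding mix-up.
    """
    return text.encode("utf-8").decode("iso-8859-1")


intended = "ü = u-umlaut"
garbled = misdecode(intended)

print(garbled)                      # Ã¼ = u-umlaut
print(len(intended), len(garbled))  # 12 13
```

The garbled string is one character longer than the intended one, which fits the observation that a bitmap sized for the intended text cuts off the end of what is drawn.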

How can I solve this problem?

import win32con, win32gui, win32ui                   # <= python -m pip install pywin32
import ctypes
import struct
import PIL.Image

def ttf(text, font, height, weight=win32con.FW_NORMAL, italic=False, underline=False):
    #__init__
    font = win32ui.CreateFont({'name': font, 'height': height, 'weight': weight, 'italic': italic, 'underline': underline})
    desktopHwnd = win32gui.GetDesktopWindow()
    desktopDC = win32gui.GetWindowDC(desktopHwnd)
    mfcDC = win32ui.CreateDCFromHandle(desktopDC)
    drawDC = mfcDC.CreateCompatibleDC()
    drawDC.SelectObject(font)
    #renderText
    drawDC.SetTextColor(0)
    w,h = drawDC.GetTextExtent(text)
    saveBitMap = win32ui.CreateBitmap()
    saveBitMap.CreateCompatibleBitmap(mfcDC, w, h)        
    drawDC.SelectObject(saveBitMap)
    drawDC.DrawText(text, (0, 0, w, h), win32con.DT_LEFT)
    #native_bmp_to_pil
    bmpheader = struct.pack("LHHHH", struct.calcsize("LHHHH"), w, h, 1, 24)
    c_bmpheader = ctypes.c_buffer(bmpheader)
    c_bits = ctypes.c_buffer(b" " * (h * ((w*3 + 3) & -4)))
    res = ctypes.windll.gdi32.GetDIBits(drawDC.GetSafeHdc(), saveBitMap.GetHandle(), 0, h, c_bits, c_bmpheader, win32con.DIB_RGB_COLORS)
    if not res:
        raise IOError("GetDIBits failed")
    pil_im = PIL.Image.frombuffer("RGB", (w, h), c_bits, "raw", "BGR", (w*3 + 3) & -4, -1)
    #__del__
    win32gui.DeleteObject(saveBitMap.GetHandle())
    mfcDC.DeleteDC()
    drawDC.DeleteDC()
    win32gui.ReleaseDC(desktopHwnd, desktopDC)
    win32gui.DeleteObject(font.GetSafeHandle())
    return pil_im

if __name__ == "__main__":
    text =     'ü = u-umlaut'                        # usually directly as a string
    text = b'\xfc = u-umlaut'.decode("ISO-8859-1")   # to make it clearer, here's binary
    ttf(text, 'Segoe UI', 80).show()                 # show the output in truetype font
    print(text)                                      # here's what it should look like

wilm02 commented 1 year ago

In the meantime I found a solution myself. Replacing the line drawDC.DrawText(text, (0, 0, w, h), win32con.DT_LEFT) with ctypes.windll.gdi32.TextOutW(drawDC.GetSafeHdc(), 0, 0, text, len(text)) solves the problem with non-ASCII characters.

Perhaps there might be a more elegant way.
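One possible way to tidy up the TextOutW workaround is to wrap it in a small function that declares the argument types explicitly. This is only a sketch: the names `text_out_w` and `utf16_units` are made up here, and the wrapper assumes that the count passed to TextOutW should be the number of UTF-16 code units, which differs from len(text) for characters outside the Basic Multilingual Plane.

```python
import ctypes
import sys


def utf16_units(text: str) -> int:
    """Return the number of UTF-16 code units needed to represent text."""
    return len(text.encode("utf-16-le")) // 2


def text_out_w(hdc: int, x: int, y: int, text: str) -> bool:
    """Draw text at (x, y) on the device context hdc using gdi32.TextOutW.

    Hypothetical wrapper; it only works on Windows.
    """
    if sys.platform != "win32":
        raise OSError("TextOutW is only available on Windows")
    from ctypes import wintypes

    func = ctypes.windll.gdi32.TextOutW
    # Declare the signature so that handles and strings are converted correctly.
    func.argtypes = [wintypes.HDC, ctypes.c_int, ctypes.c_int,
                     wintypes.LPCWSTR, ctypes.c_int]
    func.restype = wintypes.BOOL
    return bool(func(hdc, x, y, text, utf16_units(text)))
```

In the ttf function above, the DrawText line would then become text_out_w(drawDC.GetSafeHdc(), 0, 0, text).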