python-pillow / Pillow

Python Imaging Library (Fork)
https://python-pillow.org

Unicode rendering error in ImageDraw module #597

Closed: ensv closed this issue 5 years ago

ensv commented 10 years ago

I tried to render Unicode text into an image, but the rendering is not correct. Here is an example: I want to render the text "ស្នាក់" (real sequence of characters: ស + ្ + ន + ក + ់). I expected to get "ស្នាក់" in the image, but I got the raw character sequence instead ("ស្​នាក់"), with the characters not combined. Any ideas on this issue?
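To make the character sequence in the post above easy to inspect, here is a minimal sketch that lists the code points of the word using only the standard library. It assumes the word is encoded as the six code points written as escapes below; that encoding is my reading of the text, not something the post states. Under that assumption the word also contains the vowel sign ា, which the list in the post appears to leave out.

```python
import unicodedata

# Assumed encoding of the word from the post, written as escapes so the
# invisible characters that surround it in the post cannot sneak in.
WORD = "\u179f\u17d2\u1793\u17b6\u1780\u17cb"


def describe(text):
    """Return a (code point, Unicode name) pair for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


if __name__ == "__main__":
    for code_point, name in describe(WORD):
        print(code_point, name)
```

The second entry, U+17D2, is the sign that triggers the subscript form, so a renderer that draws each code point independently would show it as a separate mark.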

In case you want to test, you can use this font : http://sourceforge.net/projects/khmer/files/Fonts%20-%20KhmerOS/KhmerOS%20Fonts%205.0-%20LGPL%20Licence/All_KhmerOS_5.0.zip/download?use_mirror=skylink&download=

This is the image I get: [image: test]

SV

aclark4life commented 10 years ago

My guess is that there are many Unicode issues.

wiredfool commented 10 years ago

Can you write a Python-based test case? I may have a chance at tracking it down that way, especially if you can confirm that the font renders correctly in other applications on the same platform (e.g. LibreOffice).

Ideally, I would have a correct and an incorrect bitmap, the correct Unicode string going in, and, if you can tell what it is, what is actually being rendered.

I must admit, I don't understand Khmer, nor do I totally understand how it combines glyphs.
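A test case along these lines could be sketched as follows. This is only an illustration under assumptions: the helper names `render_text` and `images_differ`, the canvas size, and the file names in the usage comment are hypothetical, and the reference bitmap would have to come from an application that is known to render the font correctly.

```python
from PIL import Image, ImageChops, ImageDraw


def render_text(text, font, size=(400, 120)):
    """Draw black text on a white greyscale canvas and return the image."""
    image = Image.new("L", size, 255)
    ImageDraw.Draw(image).text((10, 10), text, font=font, fill=0)
    return image


def images_differ(a, b):
    """Return True if any pixel differs (assumes equal size and mode)."""
    return ImageChops.difference(a, b).getbbox() is not None


# Hypothetical usage; both file names are placeholders:
#   from PIL import ImageFont
#   font = ImageFont.truetype("KhmerOS.ttf", 40)
#   actual = render_text("\u179f\u17d2\u1793\u17b6\u1780\u17cb", font)
#   expected = Image.open("correct_img.png").convert("L")
#   print(images_differ(actual, expected))
```

An exact pixel comparison is strict: a reference produced by another application will rarely match pixel for pixel, so in practice the two bitmaps may be more useful for visual inspection than for an automated assertion.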

ensv commented 10 years ago

Hi wiredfool,

This is the simple program I used to render Khmer text into an image: http://pastebin.com/44fqWTY4. The font was tested on both Ubuntu and Windows and was confirmed to render correctly there. I have also attached an image of the correct rendering: [image: correct_img]

Actually, the problem is the subscript form in Khmer. Notice the sign that looks like a plus at the bottom ('+'): it tells the renderer that the next character must be transformed into its subscript form (and should therefore appear below the baseline). The following vowel should then attach to the previous character (the one before the subscript character). In other word processors on my PC, this rendering works as expected. Only with the Python ImageDraw module is the subscript form not applied, so the next character is always drawn in its normal form and position.

To be more precise, I tested with Python 2.7.6 and the latest Pillow version, on both Ubuntu 12.04 and Windows 7.
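The rule described above can be illustrated with a toy sketch: the sign U+17D2 marks the character that follows it as a subscript. This is not how a real text shaping engine works (that depends on the font's own tables); the function name `mark_subscripts` and the role labels are made up for illustration, and the word is assumed to be encoded as the six code points shown.

```python
COENG = "\u17d2"  # the sign that looks like a plus below the baseline

# Assumed encoding of the example word from the first post.
WORD = "\u179f\u17d2\u1793\u17b6\u1780\u17cb"


def mark_subscripts(text):
    """Label each character as 'coeng', 'subscript', or 'normal'.

    A character is a 'subscript' when it directly follows the coeng sign.
    """
    roles = []
    previous_was_coeng = False
    for ch in text:
        if ch == COENG:
            roles.append((ch, "coeng"))
            previous_was_coeng = True
        elif previous_was_coeng:
            roles.append((ch, "subscript"))
            previous_was_coeng = False
        else:
            roles.append((ch, "normal"))
    return roles


if __name__ == "__main__":
    for ch, role in mark_subscripts(WORD):
        print(f"U+{ord(ch):04X}", role)
```

A renderer that ignores this rule draws the third character in its full form on the baseline, which matches the incorrect output described in the post.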

radarhere commented 5 years ago

The combining of glyphs is handled by Raqm as of Pillow 4.2.0. However, there is a clipping issue after that, discussed in #3235.