mpcabd / python-arabic-reshaper

Reconstruct Arabic sentences to be used in applications that don't support Arabic
MIT License

Tashkeel (Harakat) Rendering Problem #32

Closed abdallah-Nasser closed 5 years ago

abdallah-Nasser commented 5 years ago

Tashkeel overlaps with characters and with other tashkeel. [Image: Tashkeel_problem] Note: I am using PIL to draw the text on the image.

DeepInSearch commented 5 years ago

In the Arabic AcTiv dataset, some letters are combined with tashkeel and represented by a new label. How can I find the right Unicode for them? Example:

`مسيرات في الإسكندرية ضد الانقلاب وهتافات تعبّر`

`Miim_B Siin_M Yaa_M Raa_E Alif_I Taaa_I Space Faa_B Yaa_E Space Alif_I Laam_IHamzaUnderAlif_I Siin_B Kaaf_M Nuun_M Daal_E Raa_I Yaa_B TaaaClosed_E Space Daad_B Daal_E Space Alif_I Laam_EAlif_E Nuun_B Gaaf_M Laam_EAlif_E Baa_I Space Waaw_I Haa_B Taaa_M Alif_E Faa_B Alif_E Taaa_I Space Taaa_B Ayn_M BaaChadda_M Raa_I`

BaaChadda_M is the example. How can I represent BaaChadda_M in Unicode and render it in Pillow?

mpcabd commented 5 years ago

This is a rendering problem, not a reshaping problem. Try different fonts, font sizes, letter heights, etc.

The reshaper uses the standard Unicode symbols for all the characters and tashkeel. There are no special symbols for letters with tashkeel on them; tashkeel symbols have no width, so they should be rendered over the previous character.
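
To illustrate the point about there being no special symbols, here is a minimal sketch using only Python's standard `unicodedata` module. It assumes that a dataset label such as BaaChadda_M stands for the letter Baa (U+0628) followed by the combining mark Shadda (U+0651). That mapping is an inference from the label name, not something the dataset documentation is quoted as saying here. The helpers `is_tashkeel` and `describe` are hypothetical names made up for this example.

```python
import unicodedata

# Standard Unicode code points (nothing reshaper-specific).
BAA = "\u0628"     # ARABIC LETTER BEH
SHADDA = "\u0651"  # ARABIC SHADDA, a combining (zero-width) mark


def is_tashkeel(ch: str) -> bool:
    """Treat nonspacing marks (category Mn) in the Arabic block as tashkeel.

    This is a simplification chosen for this sketch, not an official definition.
    """
    return unicodedata.category(ch) == "Mn" and "\u0600" <= ch <= "\u06FF"


def describe(text: str) -> list:
    """Return (code point, Unicode name, is_tashkeel) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), is_tashkeel(ch))
        for ch in text
    ]


# Assumption: a label like BaaChadda_M is the base letter plus the mark,
# stored as two code points in sequence.
baa_shadda = BAA + SHADDA

for row in describe(baa_shadda):
    print(row)
```

The string stays two code points long even after NFC normalization, since Unicode has no precomposed "Baa with Shadda" character. Drawing the mark over the letter is then up to the font and the text layout engine, which is why trying different fonts can change the result.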