mpcabd / python-arabic-reshaper

Reconstruct Arabic sentences to be used in applications that don't support Arabic
MIT License

fix for ZWJ #16

Closed leomoon closed 6 years ago

leomoon commented 6 years ago

This adds zero-width joiner (ZWJ) support, so you can write things like [ ه‍ ]

mpcabd commented 6 years ago

Hello

Can you check 75df1ed1 and run it against your use cases and tell me if it behaves as expected?

leomoon commented 6 years ago

I found some conflicts.

It doesn't work for the medial and final forms. Think of ZWJ like this: it has only one form, and every letter that can connect will connect to it. Once the reshaping has been calculated, the ZWJ can be deleted.
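The rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the library's actual code: the names `FORMS`, `joins_forward`, `joins_backward`, and `reshape_sketch` are hypothetical, the table covers just three letters, and text is processed in logical order with no handling of harakat or ligatures.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Hypothetical mini-table: (isolated, final, initial, medial).
# None means the letter has no such form (it never joins to the next letter).
FORMS = {
    "\u0628": ("\ufe8f", "\ufe90", "\ufe91", "\ufe92"),  # BEH
    "\u0627": ("\ufe8d", "\ufe8e", None, None),          # ALEF
    "\u062f": ("\ufea9", "\ufeaa", None, None),          # DAL
}


def joins_forward(ch):
    """True if ch can connect to the letter that follows it."""
    return ch == ZWJ or (ch in FORMS and FORMS[ch][2] is not None)


def joins_backward(ch):
    """True if ch can connect to the letter that precedes it."""
    return ch == ZWJ or ch in FORMS


def reshape_sketch(text):
    """Pick a presentation form per letter, treating ZWJ as a joining
    neighbour on both sides, then drop the ZWJ from the output."""
    out = []
    for i, ch in enumerate(text):
        if ch == ZWJ:
            continue  # ZWJ only influences its neighbours; it is deleted
        if ch not in FORMS:
            out.append(ch)
            continue
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        isolated, final, initial, medial = FORMS[ch]
        before = joins_forward(prev_ch)
        after = initial is not None and joins_backward(next_ch)
        if before and after:
            out.append(medial)
        elif before:
            out.append(final)
        elif after:
            out.append(initial)
        else:
            out.append(isolated)
    return "".join(out)
```

Under this sketch, ZWJ followed by BEH yields the final form, and ZWJ, BEH, ZWJ yields the medial form. Letters such as ALEF and DAL only ever become isolated or final, and the ZWJ characters themselves do not appear in the output.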

This is the test case I used: [ ئ‍ ‍ئ‍ ‍ئ ئ - ی‍ ‍ی‍ ‍ی ی - ‍د د - ‍ا ا ]

I also wanted to share my test case file. I made this to test for shaping, harekats, and bidi. It might be useful to you: https://www.dropbox.com/s/5eld6s4kepiux4z/testCases.txt

mpcabd commented 6 years ago

I've tested all my use cases in browsers, and I think the current implementation matches what browsers do. You can use https://codepen.io/mpcabd/full/MOQeNQ/ to test and play around a bit.

leomoon commented 6 years ago

I know how to replicate using your link. :D

I also made a video explaining this. It might be easier to just watch the video.

You can't see the medial form because of the font being used. The left/right connectors in your default font are so tiny that the letter looks like the INITIAL form.

If you try [ب]+[ب]+[ب], the middle [ب] "looks" like the INITIAL form (the connector is so small), but it should look like the MEDIAL form.

First, add font-family "tahoma" or "courier" to the body rule of your CSS, for example: font-family: tahoma;

Try [ZWJ]+[ب]: this should give the FINAL form, not ISOLATED. Then try [ZWJ]+[ب]+[ZWJ]: this should give the MEDIAL form, not INITIAL.
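Since fonts can make one form look like another, it may help to check the code points of reshaped output instead of the rendered glyphs. Below is a small sketch using only Python's standard `unicodedata` module. The helper name `form_of` is hypothetical, and it assumes the output uses Arabic presentation-form characters, whose Unicode names end in "ISOLATED FORM", "FINAL FORM", "INITIAL FORM", or "MEDIAL FORM".

```python
import unicodedata


def form_of(ch):
    """Return 'ISOLATED', 'FINAL', 'INITIAL' or 'MEDIAL' for a presentation-form
    character, or None if its Unicode name does not end in '<X> FORM'."""
    parts = unicodedata.name(ch, "").split()
    if len(parts) >= 2 and parts[-1] == "FORM":
        return parts[-2]
    return None


# Example: label each character of some already-reshaped text.
reshaped = "\ufe91\ufe92\ufe90"  # BEH initial, BEH medial, BEH final
print([form_of(c) for c in reshaped])
```

A character that is not a presentation form, such as the plain letter U+0628 or the ZWJ itself, gives `None`, which makes leftovers from reshaping easy to spot.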

I can explain more if needed. I don't think I'm explaining this well.

mpcabd commented 6 years ago

Hello,

I watched your video and updated the CodePen. The font was indeed playing games with my eyes. I've fixed that, so please check the latest release: it now matches the CodePen after I changed the font.

leomoon commented 6 years ago

PERFECT.