kas-gui / kas-text

Rich text processing
Apache License 2.0

Implement shaping using Rustybuzz 0.3.0 #40

Closed: dhardy closed this 3 years ago

dhardy commented 3 years ago

Proof of concept. Seems to work identically to the "simple shaper" on my Arabic sample — Rustybuzz doesn't support that language yet?

dhardy commented 3 years ago

These are screenshots from the KAS layout example, respectively with no shaping, HarfBuzz shaping, and Rustybuzz shaping:

[Screenshots: no-shaping, shaping-harfbuzz, shaping-rustybuzz]

Arabic text is from this sample.

It appears that Rustybuzz is not doing any shaping here. @RazrFalcon you claim rustybuzz has 98% compliance with the Harfbuzz test suite, hence I assume Arabic shaping is included? Any idea why it isn't being used here?

The Rustybuzz driver code is close to a copy+paste of the Harfbuzz code, and is being used.

No "features" are enabled for either HarfBuzz or Rustybuzz; I think they aren't needed? Harfbuzz enables several by default.

RazrFalcon commented 3 years ago

Rustybuzz passes HarfBuzz's test suite, which has like 1500 tests, so I'm pretty sure it's identical.

I need a minimal reproducible example. Can you run your text + font through the hb-shape and cargo run --example shape (from rustybuzz) utils? Like this: hb-shape /usr/share/fonts/corefonts/arial.ttf 'text'

dhardy commented 3 years ago

This is just the first word. And yes, it appears that font-kit happens to be selecting the font supplied with Blender.

$ cargo run --example shape /usr/share/fonts/blender/droidsans.ttf عندما
    Finished dev [unoptimized + debuginfo] target(s) in 0.01s
     Running `/home/dhardy/.cache/cargo/debug/examples/shape /usr/share/fonts/blender/droidsans.ttf 'عندما'`
uniFE8D=4+530|uniFEE1=3+1360|uniFEA9=2+1001|uniFEE5=1+1376|uniFEC9=0+1106
$ hb-shape /usr/share/fonts/blender/droidsans.ttf عندما
[uniFE8E=4+616|uniFEE3=3+1237|uniFEAA=2+1124|uniFEE8=1+698|uniFECB=0+1079]
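
To make the difference between the two outputs above easier to see, here is a minimal sketch that parses the `name=cluster+advance` entries and lists the clusters where the two shapers picked different glyphs. The names `Glyph`, `parse_glyphs` and `differing_clusters` are made up for this illustration and are not part of rustybuzz or HarfBuzz. It assumes the plain format seen in this thread; entries carrying extra fields (such as offsets) are rejected rather than parsed.

```rust
use std::collections::BTreeMap;

/// One entry of shaper output in the `name=cluster+advance` form shown above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Glyph {
    pub name: String,
    pub cluster: u32,
    pub advance: i32,
}

/// Parse a line such as `[uniFE8E=4+616|uniFEE3=3+1237]`.
/// The surrounding brackets are optional, because only hb-shape prints them here.
/// Returns `None` if any entry does not match `name=cluster+advance`.
pub fn parse_glyphs(line: &str) -> Option<Vec<Glyph>> {
    let line = line.trim();
    let line = line.strip_prefix('[').unwrap_or(line);
    let line = line.strip_suffix(']').unwrap_or(line);
    line.split('|')
        .map(|entry| {
            let (name, rest) = entry.split_once('=')?;
            let (cluster, advance) = rest.split_once('+')?;
            Some(Glyph {
                name: name.to_string(),
                cluster: cluster.parse().ok()?,
                advance: advance.parse().ok()?,
            })
        })
        .collect()
}

/// Group glyph names by cluster, so that runs of different length
/// (for example when one shaper forms a ligature) can still be compared.
fn by_cluster(glyphs: &[Glyph]) -> BTreeMap<u32, Vec<&str>> {
    let mut map: BTreeMap<u32, Vec<&str>> = BTreeMap::new();
    for g in glyphs {
        map.entry(g.cluster).or_default().push(g.name.as_str());
    }
    map
}

/// Sorted list of clusters whose glyph names differ between the two runs.
/// A cluster present in only one run counts as different.
pub fn differing_clusters(a: &[Glyph], b: &[Glyph]) -> Vec<u32> {
    let (a, b) = (by_cluster(a), by_cluster(b));
    let mut out: Vec<u32> = a
        .keys()
        .chain(b.keys())
        .copied()
        .filter(|c| a.get(c) != b.get(c))
        .collect();
    out.sort_unstable();
    out.dedup();
    out
}
```

Fed the two one-word outputs above, `differing_clusters` reports every cluster from 0 to 4: rustybuzz returns a different glyph than HarfBuzz for each of the five letters.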

Edit: longer samples

$ hb-shape /usr/share/fonts/blender/droidsans.ttf "عندما يريد العالم أن يتكلّم  ، فهو يتحدّث بلغة يونيكود. تسجّل الآن لحضور المؤتمر الدولي العاشر ليونيكود"
[uniFEA9=104+1001|uniFEEE=103+1112|uniFEDC=102+1159|uniFEF4=101+698|uniFEE7=100+616|uniFEEE=99+1112|uniFEF4=98+698|uniFEDF=97+616|space=96+651|uniFEAE=95+723|uniFEB7=94+1679|uniFE8E=93+616|uniFECC=92+1217|uniFEDF=91+616|uniFE8D=90+530|space=89+651|uniFEF2=88+2075|uniFEDF=87+616|uniFEED=86+1128|uniFEAA=85+1124|uniFEDF=84+616|uniFE8D=83+530|space=82+651|uniFEAE=81+723|uniFEE4=80+1296|uniFE97=79+616|uniFE86=78+1112|uniFEE4=77+1296|uniFEDF=76+616|uniFE8D=75+530|space=74+651|uniFEAD=73+682|uniFEEE=72+1112|uniFEC0=71+1665|uniFEA4=70+1176|uniFEDF=69+616|space=68+651|uniFEE5=67+1376|uniFEF5=65+1370|uniFE8D=64+530|space=63+651|uniFEDE=62+1417|afii57457=60+0|uniFEA0=60+1176|uniFEB4=59+1720|uniFE97=58+616|space=57+651|period=56+651|uniFEA9=55+1001|uniFEEE=54+1112|uniFEDC=53+1159|uniFEF4=52+698|uniFEE7=51+616|uniFEEE=50+1112|uniFEF3=49+616|space=48+651|uniFE94=47+1206|uniFED0=46+1217|uniFEE0=45+698|uniFE91=44+616|space=43+651|uniFE99=42+1593|afii57457=40+0|uniFEAA=40+1124|uniFEA4=39+1176|uniFE98=38+698|uniFEF3=37+616|space=36+651|uniFEEE=35+1112|uniFEEC=34+1671|uniFED3=33+1217|space=32+651|afii57388=31+661|space=30+651|space=29+0|space=28+651|uniFEE2=27+1417|afii57457=25+0|uniFEE0=25+698|uniFEDC=24+1159|uniFE98=23+698|uniFEF3=22+616|space=21+0|space=20+651|uniFEE5=19+1376|uniFE83=18+530|space=17+651|uniFEE2=16+1417|uniFEDF=15+616|uniFE8E=14+616|uniFECC=13+1217|uniFEDF=12+616|uniFE8D=11+530|space=10+651|uniFEAA=9+1124|uniFEF3=8+616|uniFEAE=7+723|uniFEF3=6+616|space=5+651|uniFE8E=4+616|uniFEE3=3+1237|uniFEAA=2+1124|uniFEE8=1+698|uniFECB=0+1079]
$ cargo run --example shape /usr/share/fonts/blender/droidsans.ttf "عندما يريد العالم أن يتكلّم  ، فهو يتحدّث بلغة يونيكود. تسجّل الآن لحضور المؤتمر الدولي العاشر ليونيكود"
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `/home/dhardy/.cache/cargo/debug/examples/shape /usr/share/fonts/blender/droidsans.ttf 'عندما يريد العالم أن يتكلّم  ، فهو يتحدّث بلغة يونيكود. تسجّل الآن لحضور المؤتمر الدولي العاشر ليونيكود'`
uniFEA9=104+1001|uniFEED=103+1128|uniFED9=102+1593|uniFEF1=101+2116|uniFEE5=100+1376|uniFEED=99+1128|uniFEF1=98+2116|uniFEDD=97+1376|space=96+651|uniFEAD=95+682|uniFEB5=94+2425|uniFE8D=93+530|uniFEC9=92+1106|uniFEDD=91+1376|uniFE8D=90+530|space=89+651|uniFEF1=88+2116|uniFEDD=87+1376|uniFEED=86+1128|uniFEA9=85+1001|uniFEDD=84+1376|uniFE8D=83+530|space=82+651|uniFEAD=81+682|uniFEE1=80+1360|uniFE95=79+1593|uniFE85=78+1128|uniFEE1=77+1360|uniFEDD=76+1376|uniFE8D=75+530|space=74+651|uniFEAD=73+682|uniFEED=72+1128|uniFEBD=71+2413|uniFEA1=70+1178|uniFEDD=69+1376|space=68+651|uniFEE5=67+1376|uniFE81=66+530|uniFEDD=65+1376|uniFE8D=64+530|space=63+651|uniFEDD=62+1376|afii57457=60+0|uniFE9D=60+1178|uniFEB1=59+2425|uniFE95=58+1593|space=57+651|period=56+651|uniFEA9=55+1001|uniFEED=54+1128|uniFED9=53+1593|uniFEF1=52+2116|uniFEE5=51+1376|uniFEED=50+1128|uniFEF1=49+2116|space=48+651|uniFE93=47+1182|uniFECD=46+1106|uniFEDD=45+1376|uniFE8F=44+1593|space=43+651|uniFE99=42+1593|afii57457=40+0|uniFEA9=40+1001|uniFEA1=39+1178|uniFE95=38+1593|uniFEF1=37+2116|space=36+651|uniFEED=35+1128|afii57470=34+1182|uniFED1=33+2073|space=32+651|afii57388=31+661|space=30+651|space=29+0|space=28+651|uniFEE1=27+1360|afii57457=25+0|uniFEDD=25+1376|uniFED9=24+1593|uniFE95=23+1593|uniFEF1=22+2116|space=21+0|space=20+651|uniFEE5=19+1376|uniFE83=18+530|space=17+651|uniFEE1=16+1360|uniFEDD=15+1376|uniFE8D=14+530|uniFEC9=13+1106|uniFEDD=12+1376|uniFE8D=11+530|space=10+651|uniFEA9=9+1001|uniFEF1=8+2116|uniFEAD=7+682|uniFEF1=6+2116|space=5+651|uniFE8D=4+530|uniFEE1=3+1360|uniFEA9=2+1001|uniFEE5=1+1376|uniFEC9=0+1106

RazrFalcon commented 3 years ago

Can you also include this exact font? I will take a look.

dhardy commented 3 years ago

blender-droidsans.ttf.gz

dhardy commented 3 years ago

Sorry, you might need to rename that to _.tar.gz.

RazrFalcon commented 3 years ago

I've finally found time to investigate this, and it looks like you've hit the unsupported 2%. More specifically, you hit the Arabic fallback shaping feature, which is not implemented and probably never will be, because it basically requires creating/synthesizing a temporary Arabic font in memory.

Sadly, I'm not sure why HarfBuzz does this, but I presume it is because this particular font doesn't have Arabic shaping data. So you should simply use another one.

HarfBuzz has a single test case for this feature, so I'm not sure how common it is.

In general, rustybuzz does support Arabic shaping, as long as you are using a font with Arabic shaping data, like Amiri.
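
As a rough illustration of what a fallback has to work out when the font carries no shaping data, the sketch below picks the isolated, initial, medial or final form of each letter from its neighbours and maps it to the corresponding Arabic Presentation Forms-B code point. This is not HarfBuzz's implementation, only the basic joining idea. The names `Forms`, `forms` and `presentation_forms` are hypothetical, the table is hand-written for the five letters of the first sample word, and marks, ligatures and all other letters are ignored. It also assumes, as the glyph names in the output suggest, that this font names its glyphs after those code points.

```rust
/// Contextual forms of one Arabic letter as Presentation Forms-B code points.
/// `initial` and `medial` are `None` for letters that only join to the
/// preceding letter (such as alef and dal).
#[derive(Clone, Copy)]
struct Forms {
    isolated: char,
    final_form: char,
    initial: Option<char>,
    medial: Option<char>,
}

/// Hand-written table covering only the letters of the sample word.
fn forms(c: char) -> Option<Forms> {
    let dual = |iso, fin, ini, med| Forms {
        isolated: iso,
        final_form: fin,
        initial: Some(ini),
        medial: Some(med),
    };
    let right = |iso, fin| Forms {
        isolated: iso,
        final_form: fin,
        initial: None,
        medial: None,
    };
    match c {
        '\u{0627}' => Some(right('\u{FE8D}', '\u{FE8E}')), // alef
        '\u{062F}' => Some(right('\u{FEA9}', '\u{FEAA}')), // dal
        '\u{0639}' => Some(dual('\u{FEC9}', '\u{FECA}', '\u{FECB}', '\u{FECC}')), // ain
        '\u{0645}' => Some(dual('\u{FEE1}', '\u{FEE2}', '\u{FEE3}', '\u{FEE4}')), // meem
        '\u{0646}' => Some(dual('\u{FEE5}', '\u{FEE6}', '\u{FEE7}', '\u{FEE8}')), // noon
        _ => None,
    }
}

/// Replace each known letter by its contextual presentation form, in logical
/// (input) order. Characters missing from the table are passed through
/// unchanged and break the joining.
pub fn presentation_forms(text: &str) -> Vec<char> {
    let chars: Vec<char> = text.chars().collect();
    let table: Vec<Option<Forms>> = chars.iter().map(|&c| forms(c)).collect();
    let mut out = Vec::with_capacity(chars.len());
    for i in 0..chars.len() {
        let f = match table[i] {
            Some(f) => f,
            None => {
                out.push(chars[i]);
                continue;
            }
        };
        // The previous letter connects to this one only if it can join forward.
        let joins_prev = i > 0 && table[i - 1].map_or(false, |p| p.initial.is_some());
        // This letter connects forward only if it can, and a known letter follows.
        let joins_next = f.initial.is_some() && i + 1 < chars.len() && table[i + 1].is_some();
        let shaped = match (joins_prev, joins_next) {
            (true, true) => f.medial.unwrap_or(f.final_form),
            (true, false) => f.final_form,
            (false, true) => f.initial.unwrap_or(f.isolated),
            (false, false) => f.isolated,
        };
        out.push(shaped);
    }
    out
}
```

For the first sample word this yields U+FECB, U+FEE8, U+FEAA, U+FEE3, U+FE8E in logical order, which corresponds to the glyph names hb-shape printed (its output lists them right to left, cluster 4 first). The rustybuzz output instead consists of the isolated forms throughout.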

dhardy commented 3 years ago

Interesting answer. Thanks for investigating.