RazrFalcon / rustybuzz

A complete harfbuzz's shaping algorithm port to Rust
MIT License

Different shaping results compared to harfbuzz #63

Open floppyhammer opened 1 year ago

floppyhammer commented 1 year ago

Test text: ح ب ا حبا

It consists of two parts (separated with a space):

حبا
ح ب ا

The second part is basically the first part with every character delimited with a space (in order to cancel ligatures).

Related buffer config:

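// `Language::from_str` comes from the `FromStr` trait, which must be in scope.
use std::str::FromStr;

buffer.set_direction(rustybuzz::Direction::RightToLeft);
buffer.set_language(rustybuzz::Language::from_str("ar").unwrap());
buffer.set_script(rustybuzz::script::ARABIC);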

Test case 1 (unifont-15.0.01.ttf)
Shaped glyph IDs:
harfbuzz:  56721 56725 56237 35 56720 35 56722 35 56740
rustybuzz: 1578 1579 1584 35 1578 35 1579 35 1584

So I first tested with Unifont, and rustybuzz failed to handle the ligatures. I thought I must have gotten something wrong, so I then tested with another font.

Test case 2 (HONORSansArabicUI-B.ttf)
Shaped glyph IDs:
harfbuzz:  7 33 97 1 5 1 30 1 93
rustybuzz: 7 33 97 1 5 1 30 1 93

This time the result is correct and matches that of harfbuzz.

There might be something off with rustybuzz.

behdad commented 1 year ago

It is probably because Unifont doesn't have OpenType shaping rules for Arabic, and HarfBuzz synthesizes those, implementing fallback Arabic shaping, which rustybuzz currently doesn't. This is documented behavior.
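
To illustrate what fallback Arabic shaping has to decide, here is a minimal Rust sketch. It is not HarfBuzz's or rustybuzz's code. It picks a positional form for each letter from its neighbours and swaps in the matching code point from the Unicode Arabic Presentation Forms-B block. The names `Form`, `Joining`, `TABLE`, `lookup` and `fallback_forms` are hypothetical, the table covers only the three letters of the test text, and the joining rules are reduced to dual-joining versus right-joining. A real shaper would also have to check that the font maps each presentation form to a glyph.

```rust
/// Positional form of a letter; the discriminants index the form arrays below.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Form {
    Isolated = 0,
    Final = 1,
    Initial = 2,
    Medial = 3,
}

/// Joining behaviour, reduced to the two classes this sample needs.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Joining {
    /// Connects to both neighbours (BEH, HAH).
    Dual,
    /// Connects only to the preceding letter (ALEF).
    Right,
}

/// (base letter, joining class, [isolated, final, initial, medial]).
/// Hypothetical excerpt: only ALEF, BEH and HAH are listed.
const TABLE: &[(char, Joining, [Option<char>; 4])] = &[
    ('\u{0627}', Joining::Right, [Some('\u{FE8D}'), Some('\u{FE8E}'), None, None]),
    ('\u{0628}', Joining::Dual, [Some('\u{FE8F}'), Some('\u{FE90}'), Some('\u{FE91}'), Some('\u{FE92}')]),
    ('\u{062D}', Joining::Dual, [Some('\u{FEA1}'), Some('\u{FEA2}'), Some('\u{FEA3}'), Some('\u{FEA4}')]),
];

fn lookup(c: char) -> Option<&'static (char, Joining, [Option<char>; 4])> {
    TABLE.iter().find(|entry| entry.0 == c)
}

/// Replaces each known letter (in logical order) with its presentation form.
/// Characters outside the table, such as spaces, pass through and break joining.
fn fallback_forms(text: &str) -> Vec<char> {
    let chars: Vec<char> = text.chars().collect();
    let mut out = Vec::with_capacity(chars.len());
    for (i, &c) in chars.iter().enumerate() {
        let entry = match lookup(c) {
            Some(entry) => entry,
            None => {
                out.push(c);
                continue;
            }
        };
        // The previous letter connects forward only if it is dual-joining.
        let joins_prev = i > 0 && lookup(chars[i - 1]).map_or(false, |p| p.1 == Joining::Dual);
        // This letter connects forward only if it is dual-joining and a letter follows.
        let joins_next = entry.1 == Joining::Dual
            && chars.get(i + 1).and_then(|&n| lookup(n)).is_some();
        let form = match (joins_prev, joins_next) {
            (false, false) => Form::Isolated,
            (true, false) => Form::Final,
            (false, true) => Form::Initial,
            (true, true) => Form::Medial,
        };
        // Fall back to the base character if the table has no such form.
        out.push(entry.2[form as usize].unwrap_or(c));
    }
    out
}
```

With this sketch, the joined word from the test text maps to an initial, a medial and a final form, while the space-separated letters all map to isolated forms.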

behdad commented 1 year ago

From README:

floppyhammer commented 1 year ago

@behdad Thanks for the answer! I didn't realize that.

RazrFalcon commented 1 year ago

Yes, this particular feature is not implemented. I wasn't expecting it to be that common. I guess I would have to figure it out after all.

LaurenzV commented 3 months ago

Could you further explain why implementing this requires subsetting? I don't understand how a shaping feature would require implementing subsetting?

RazrFalcon commented 3 months ago

It doesn't "require" subsetting per se, but it requires part of it. At least this is how it is implemented in HB. I do not remember the details, it was nearly 5 years ago, but I assume this is hb-ot-shaper-arabic-fallback.hh. As you can see, HB creates temporary tables using the subsetting code and then passes them to the shaper, meaning we would have to implement this part of subsetting as well.

behdad commented 3 months ago

It needs the code to serialize SingleSubst and LigatureSubst tables. Alternatively you can implement the logic in code...
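
A rough sketch of the "logic in code" route, under stated assumptions: instead of serializing real SingleSubst tables, keep one in-memory map per positional feature, from the base glyph to the presentation-form glyph, and include only pairs the font actually covers. The names `GlyphId`, `FORMS`, `FEATURES` and `build_single_substs` are hypothetical, and so is the `cmap` closure, which stands in for whatever character-to-glyph lookup the font API provides. None of this is an existing rustybuzz or HarfBuzz interface, and ligature substitutions are left out.

```rust
use std::collections::HashMap;

/// Hypothetical glyph identifier type for this sketch.
type GlyphId = u16;

/// OpenType feature tags for the four positional forms, in table order.
const FEATURES: [&str; 4] = ["isol", "fina", "init", "medi"];

/// (base letter, [isolated, final, initial, medial] presentation forms).
/// Hypothetical excerpt: only ALEF, BEH and HAH are listed.
const FORMS: &[(char, [Option<char>; 4])] = &[
    ('\u{0627}', [Some('\u{FE8D}'), Some('\u{FE8E}'), None, None]),
    ('\u{0628}', [Some('\u{FE8F}'), Some('\u{FE90}'), Some('\u{FE91}'), Some('\u{FE92}')]),
    ('\u{062D}', [Some('\u{FEA1}'), Some('\u{FEA2}'), Some('\u{FEA3}'), Some('\u{FEA4}')]),
];

/// Builds one base-glyph -> form-glyph map per feature.
/// A pair is kept only when the font has glyphs for both code points.
fn build_single_substs(
    cmap: impl Fn(char) -> Option<GlyphId>,
) -> HashMap<&'static str, HashMap<GlyphId, GlyphId>> {
    let mut lookups = HashMap::new();
    for (idx, feature) in FEATURES.iter().enumerate() {
        let mut map = HashMap::new();
        for (base, forms) in FORMS {
            let from = cmap(*base);
            let to = forms[idx].and_then(|c| cmap(c));
            if let (Some(from), Some(to)) = (from, to) {
                map.insert(from, to);
            }
        }
        lookups.insert(*feature, map);
    }
    lookups
}
```

The shaper could then consult `lookups["init"]`, `lookups["medi"]` and so on when applying the corresponding feature, in place of a lookup read from the font's GSUB table. A font without presentation-form glyphs simply yields empty maps.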

LaurenzV commented 3 months ago

Good to know, thanks! I guess it should be doable...