RazrFalcon / rustybuzz

A complete harfbuzz's shaping algorithm port to Rust
MIT License
502 stars 34 forks source link

Try to port set_digest #121

Closed LaurenzV closed 6 days ago

LaurenzV commented 1 week ago

Let's see how it goes.

LaurenzV commented 1 week ago

@behdad In hb_set_digest_combinter_t, is

bool add_range (hb_codepoint_t a, hb_codepoint_t b)
  {
    return (int) head.add_range (a, b) | (int) tail.add_range (a, b);
  }

just

bool add_range (hb_codepoint_t a, hb_codepoint_t b)
  {
    return head.add_range || tail.add_range (a, b);
  }

or am I missing something?

EDIT: Nevermind, the second one would short-circuit while the first one would not, I think.

LaurenzV commented 1 week ago

@RazrFalcon So, turns out we actually have something similar to digest set already, I just didn't notice because it was named differently. :p But it was only used in two locations. Anyway, I replaced it with the harfbuzz set digest implementation and I'm seeing some nice speedups already! I'll see whether I can find the remaining places where it's used in harfbuzz as well.

One thing to note is that I'm not sure I'll be able to replicate the harfbuzz structure 1:1, because in harfbuzz there is this accelerator wrapper struct as well as Proxy, and I'm not sure if it will be easy to port.

Test Name Before After
arabic::paragraph_long::rb 392520.8 338027.15
arabic::paragraph_medium::rb 231291.67 190327.6
arabic::paragraph_short::rb 77015.17 64271.48
arabic::sentence_1::rb 54922.92 43420.13
arabic::sentence_2::rb 55653.17 45505.09
arabic::word_1::rb 16283.12 10325.39
arabic::word_2::rb 17411.95 11588.63
arabic::word_3::rb 16523.29 10618.7
english::aat_paragraph_long::rb 497358.4 504909.64
english::aat_sentence_1::rb 58580.37 58986.61
english::aat_word_1::rb 4393.1 4524.62
english::long_zalgo::rb 1870320.9 1720540.6
english::paragraph_long::rb 456722.9 378100
english::paragraph_long_mono::rb 79397.35 79218.18
english::paragraph_medium::rb 331425 268264.41
english::paragraph_short::rb 211846.53 164255.83
english::sentence_1::rb 125274.98 92912.97
english::sentence_2::rb 115765.27 85812.92
english::sentence_mono::rb 12664.62 12504.7
english::short_zalgo::rb 152888.46 119047.93
english::variations::rb 582909.11 418908.35
english::variations_default::rb 538897.93 417473.97
english::word_1::rb 74295.13 48810.42
english::word_2::rb 96590.96 50727.23
english::word_3::rb 74267.09 48499.26
english::word_4::rb 76657.63 50706.13
hebrew::paragraph_long_1::rb 302007.55 223193.03
hebrew::paragraph_long_2::rb 449438.54 350977.1
hebrew::paragraph_medium::rb 76569.64 59700.3
hebrew::sentence_1::rb 31902.71 24542.92
hebrew::sentence_2::rb 25034.39 18634.87
hebrew::word_1::rb 9316.79 6817.56
hebrew::word_2::rb 9340.54 6853.35
hindi::aat_paragraph_long::rb 1408535.94 1340325.1
hindi::aat_sentence::rb 203805.53 174498.98
hindi::aat_word::rb 47851.74 23387.5
hindi::paragraph_long::rb 1018933.32 874204.2
hindi::paragraph_medium::rb 645542.71 550805.22
hindi::paragraph_short::rb 369729.17 302540.62
hindi::sentence_1::rb 169481.64 135329.17
hindi::sentence_2::rb 162120.49 120293.07
hindi::word_1::rb 68044.56 48514.48
hindi::word_2::rb 67839.84 49835.55
khmer::aat_paragraph_long_1::rb 777954.2 729962.6
khmer::aat_sentence_1::rb 78639.4 68349.99
khmer::aat_word_1::rb 20644.61 13328.75
khmer::paragraph_long_1::rb 753512.5 619012.4
khmer::paragraph_long_2::rb 902437.52 754854.1
khmer::paragraph_medium::rb 433292.18 353707.28
khmer::sentence_1::rb 58932.91 42132.06
khmer::sentence_2::rb 59866.52 46493.28
khmer::word_1::rb 25951.28 19585.42
khmer::word_2::rb 29661.01 22463.33
khmer::word_3::rb 26616.51 20344.21
myanmar::aat_paragraph_long::rb 2488008.33 2493779.15
myanmar::aat_sentence_1::rb 100067.26 95012.7
myanmar::aat_word_1::rb 11847.01 9663.62
myanmar::paragraph_long::rb 3104907.3 2924845.8
myanmar::paragraph_medium::rb 1182590.62 1105495.85
myanmar::paragraph_short::rb 257289.93 232050
myanmar::sentence_1::rb 259715.62 236104.17
myanmar::sentence_2::rb 348687.5 324939.6
myanmar::word_1::rb 25923.36 18480.59
myanmar::word_2::rb 26585 18805.55
thai::paragraph_long::rb 406404.17 369447.66
thai::paragraph_medium::rb 131565.62 114900.68
thai::paragraph_short::rb 52665.87 45863.39
thai::sentence_1::rb 41688.02 35667.65
thai::word_1::rb 11326.98 8625.52
thai::word_2::rb 12161.12 9092.81
RazrFalcon commented 6 days ago

Great!

So, turns out we actually have something similar to digest set already, I just didn't notice because it was named differently.

Yes, looks like laurmaedje decided to implement it that way. The whole GSUB/GPOS/OpenType layout was written by him. Eventually I moved tables parsing from RB to ttf-parser, to make RB parsing-free, but everything else was left as is.

behdad commented 6 days ago

This looks great!

These are places that HB uses the set-digest:

[0] Available from https://github.com/harfbuzz/harfbuzz-monster-fonts

LaurenzV commented 6 days ago

Thanks!

behdad commented 6 days ago

With the buffer digest, note that you need to:

LaurenzV commented 6 days ago

Yep, no worries, those are already included. 😄

LaurenzV commented 6 days ago

Alright, AAT and GDEF are a bit trickier unfortunately because from what I can tell ttf-parser doesn't always expose the necessary information on the glyph coverage, so I guess let's leave it at that for now. I've added TODOs in the appropriate places.

We still have some nice overall performance improvements!

RazrFalcon commented 6 days ago

Thanks!