Closed LaurenzV closed 6 days ago
@behdad In hb_set_digest_combinter_t
, is
bool add_range (hb_codepoint_t a, hb_codepoint_t b)
{
return (int) head.add_range (a, b) | (int) tail.add_range (a, b);
}
just
bool add_range (hb_codepoint_t a, hb_codepoint_t b)
{
return head.add_range || tail.add_range (a, b);
}
or am I missing something?
EDIT: Nevermind, the second one would short-circuit while the first one would not, I think.
@RazrFalcon So, turns out we actually have something similar to digest set already, I just didn't notice because it was named differently. :p But it was only used in two locations. Anyway, I replaced it with the harfbuzz set digest implementation and I'm seeing some nice speedups already! I'll see whether I can find the remaining places where it's used in harfbuzz as well.
One thing to note is that I'm not sure I'll be able to replicate the harfbuzz structure 1:1, because in harfbuzz there is this accelerator
wrapper struct as well as Proxy
, and I'm not sure if it will be easy to port.
Test Name | Before | After |
---|---|---|
arabic::paragraph_long::rb | 392520.8 | 338027.15 |
arabic::paragraph_medium::rb | 231291.67 | 190327.6 |
arabic::paragraph_short::rb | 77015.17 | 64271.48 |
arabic::sentence_1::rb | 54922.92 | 43420.13 |
arabic::sentence_2::rb | 55653.17 | 45505.09 |
arabic::word_1::rb | 16283.12 | 10325.39 |
arabic::word_2::rb | 17411.95 | 11588.63 |
arabic::word_3::rb | 16523.29 | 10618.7 |
english::aat_paragraph_long::rb | 497358.4 | 504909.64 |
english::aat_sentence_1::rb | 58580.37 | 58986.61 |
english::aat_word_1::rb | 4393.1 | 4524.62 |
english::long_zalgo::rb | 1870320.9 | 1720540.6 |
english::paragraph_long::rb | 456722.9 | 378100 |
english::paragraph_long_mono::rb | 79397.35 | 79218.18 |
english::paragraph_medium::rb | 331425 | 268264.41 |
english::paragraph_short::rb | 211846.53 | 164255.83 |
english::sentence_1::rb | 125274.98 | 92912.97 |
english::sentence_2::rb | 115765.27 | 85812.92 |
english::sentence_mono::rb | 12664.62 | 12504.7 |
english::short_zalgo::rb | 152888.46 | 119047.93 |
english::variations::rb | 582909.11 | 418908.35 |
english::variations_default::rb | 538897.93 | 417473.97 |
english::word_1::rb | 74295.13 | 48810.42 |
english::word_2::rb | 96590.96 | 50727.23 |
english::word_3::rb | 74267.09 | 48499.26 |
english::word_4::rb | 76657.63 | 50706.13 |
hebrew::paragraph_long_1::rb | 302007.55 | 223193.03 |
hebrew::paragraph_long_2::rb | 449438.54 | 350977.1 |
hebrew::paragraph_medium::rb | 76569.64 | 59700.3 |
hebrew::sentence_1::rb | 31902.71 | 24542.92 |
hebrew::sentence_2::rb | 25034.39 | 18634.87 |
hebrew::word_1::rb | 9316.79 | 6817.56 |
hebrew::word_2::rb | 9340.54 | 6853.35 |
hindi::aat_paragraph_long::rb | 1408535.94 | 1340325.1 |
hindi::aat_sentence::rb | 203805.53 | 174498.98 |
hindi::aat_word::rb | 47851.74 | 23387.5 |
hindi::paragraph_long::rb | 1018933.32 | 874204.2 |
hindi::paragraph_medium::rb | 645542.71 | 550805.22 |
hindi::paragraph_short::rb | 369729.17 | 302540.62 |
hindi::sentence_1::rb | 169481.64 | 135329.17 |
hindi::sentence_2::rb | 162120.49 | 120293.07 |
hindi::word_1::rb | 68044.56 | 48514.48 |
hindi::word_2::rb | 67839.84 | 49835.55 |
khmer::aat_paragraph_long_1::rb | 777954.2 | 729962.6 |
khmer::aat_sentence_1::rb | 78639.4 | 68349.99 |
khmer::aat_word_1::rb | 20644.61 | 13328.75 |
khmer::paragraph_long_1::rb | 753512.5 | 619012.4 |
khmer::paragraph_long_2::rb | 902437.52 | 754854.1 |
khmer::paragraph_medium::rb | 433292.18 | 353707.28 |
khmer::sentence_1::rb | 58932.91 | 42132.06 |
khmer::sentence_2::rb | 59866.52 | 46493.28 |
khmer::word_1::rb | 25951.28 | 19585.42 |
khmer::word_2::rb | 29661.01 | 22463.33 |
khmer::word_3::rb | 26616.51 | 20344.21 |
myanmar::aat_paragraph_long::rb | 2488008.33 | 2493779.15 |
myanmar::aat_sentence_1::rb | 100067.26 | 95012.7 |
myanmar::aat_word_1::rb | 11847.01 | 9663.62 |
myanmar::paragraph_long::rb | 3104907.3 | 2924845.8 |
myanmar::paragraph_medium::rb | 1182590.62 | 1105495.85 |
myanmar::paragraph_short::rb | 257289.93 | 232050 |
myanmar::sentence_1::rb | 259715.62 | 236104.17 |
myanmar::sentence_2::rb | 348687.5 | 324939.6 |
myanmar::word_1::rb | 25923.36 | 18480.59 |
myanmar::word_2::rb | 26585 | 18805.55 |
thai::paragraph_long::rb | 406404.17 | 369447.66 |
thai::paragraph_medium::rb | 131565.62 | 114900.68 |
thai::paragraph_short::rb | 52665.87 | 45863.39 |
thai::sentence_1::rb | 41688.02 | 35667.65 |
thai::word_1::rb | 11326.98 | 8625.52 |
thai::word_2::rb | 12161.12 | 9092.81 |
Great!
So, turns out we actually have something similar to digest set already, I just didn't notice because it was named differently.
Yes, looks like laurmaedje decided to implement it that way. The whole GSUB/GPOS/OpenType layout was written by him.
Eventually I moved tables parsing from RB to ttf-parser
, to make RB parsing-free, but everything else was left as is.
This looks great!
These are places that HB uses the set-digest:
[0] Available from https://github.com/harfbuzz/harfbuzz-monster-fonts
Thanks!
With the buffer digest, note that you need to:
Recalculate the digest after shaper functions that modify the buffer, https://github.com/harfbuzz/harfbuzz/blob/e079dd20304bacd8d753eee2ceacbaa1a3978a34/src/hb-ot-layout.cc#L2050
Add new glyphs to the digest as they are added in GSUB. https://github.com/harfbuzz/harfbuzz/blob/e079dd20304bacd8d753eee2ceacbaa1a3978a34/src/hb-ot-layout-gsubgpos.hh#L839
Yep, no worries, those are already included. 😄
Alright, AAT and GDEF are a bit trickier unfortunately because from what I can tell ttf-parser
doesn't always expose the necessary information on the glyph coverage, so I guess let's leave it at that for now. I've added TODOs in the appropriate places.
We still have some nice overall performance improvements!
Thanks!
Let's see how it goes.