Closed laurmaedje closed 1 year ago
This code is autogenerated in harfbuzz, aka a state machine. Will see what we can do.
If the Rust codes were replicated exactly from the C codes, I guess that out-of-bound read also happens in harfbuzz 2.7.1, in https://github.com/harfbuzz/harfbuzz/blob/2.7.1/src/hb-ot-shape-complex-use-machine.hh#L432, info[p].use_category()
is expanded to info[p].var2.u8[2]
(use_category
is a macro), without any pointer being dereferenced, so it may not cause any serious bug yet and nobody noticed it.
Note it was rewritten by PR https://github.com/harfbuzz/harfbuzz/pull/1695 in harfbuzz 7.0.1 and backported to 2.7.3.
When shaping the text below with Noto Sans Meetei Mayek Regular, rustybuzz panics with an out of bounds error in the universal shaping engine in line 370. It's possible to get rid of the panic by changing the comparison in line 478 from
!=
to<
. The problem is thatp
skips directly beyondpe
so the break doesn't trigger. I tried to look what Harfbuzz does there, but it's quite hard to follow, so not sure whether this would be the right fix.
This sounds like https://blog.cloudflare.com/incident-report-on-memory-leak-caused-by-cloudflare-parser-bug/ :)
@behdad *_machine
code is the worst. I had to port autogenerated C code with tons of goto's to Rust. No surprises here.
Luckily, hb test suite is extensive enough and there were nearly zero bugreports in the past two years.
@thynson Thanks. That's interesting. There are no pointer manipulations in rustybuzz, so any hidden hb/C bugs/UB will become visible.
And rustybuzz is still at 2.7.1 I have no time lately to sync it up.
A simpler example here is anything involving the Balinese character ᭄ (U+1B44). Rustybuzz panics in the same line when shaping with Noto Serif Balinese.
This may be fixed by #77
When shaping the text below with Noto Sans Meetei Mayek Regular, rustybuzz panics with an out of bounds error in the universal shaping engine in line 370. It's possible to get rid of the panic by changing the comparison in line 478 from
!=
to<
. The problem is thatp
skips directly beyondpe
so the break doesn't trigger. I tried to look what Harfbuzz does there, but it's quite hard to follow, so not sure whether this would be the right fix.The code below reproduces the problem:
(The text simply consists of all codepoints in
U+AAE6 ..= U+AAF6
.)