Closed nwellnhof closed 3 years ago
Parsing a *t sequence followed by a _t*_ sequence exhibits quadratic behavior:
*t
_t*_
$ for n in 5000 10000 20000; do python3 -c "print('*t '*$n+'_t*_ '*$n)" |time -f "$n elems: %e secs" build/src/cmark >/dev/null; done 5000 elems: 0.19 secs 10000 elems: 0.88 secs 20000 elems: 5.45 secs
*t *t *t *t _t*_ _t*_ _t*_ _t*_
*
_
openers_bottom
Parsing a
*t
sequence followed by a_t*_
sequence exhibits quadratic behavior:Sample pattern
Analysis
*
closer, the preceding_
opener is removed._
closer, no opener is found. The correspondingopeners_bottom
entry is set to preceding*
opener.*
closer, the corresponding opener is removed. This is the entry inopeners_bottom
.openers_bottom
becomes a dangling pointer._
closers will test against the dangling pointer (which is undefined behavior).openers_bottom
optimization stops working, resulting in quadratic behavior with abusive input patterns.Possible solutions
openers_bottom
when removing delimiters (somewhat expensive).openers_bottom
.