indutny closed this pull request 1 year ago
With the help of https://github.com/unicode-rs/unicode-segmentation/pull/113 this gives:
Full Report: https://gist.github.com/indutny/9f5f623c52d6e935225edfe09857ef99
Rebased over the latest master, but it looks like nothing changed in this PR.
Let me run benchmarks on this first, it looks like I might have regressed with my latest refactor.
Alright, let me know when I should look at this!
@Manishearth Should be good now, sorry for the back and forth! I made some simplifications, and the benchmarks look even better than before: https://gist.github.com/indutny/c8d29a00680bfbaf19d22d02a7175c0d (with up to 51% improvement on some word-bounds benches).
Thanks!
Wow! Thank you for merging it!
Not to impose, but I was wondering when you planned to release a new version of the library?
I'd happily merge a PR bumping the library to 1.10.1.
You got it!
Wait, I have one more optimization for you. :-) (EDIT: I'm @indutny's coworker, we've been loosely collaborating and independently looking at optimization opportunities in a few of the unicode-rs crates.)
Ah, it looks like @indutny's optimization has made mine irrelevant: some texts get faster, others get slower. So it's probably not worth it.
Prior to this change, a table search had to do a binary search over roughly 1000 entries, which resulted in around 10 memory loads on average. This commit reduces the search space by doing a pre-lookup in a generated table to get a smaller (often zero-length) slice of the full sorted range list. On average this leaves just one range-list entry to binary-search over, which reduces the average number of memory loads to 2.
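For readers following along, here is a minimal sketch of the two-level lookup idea described above. All names (`RANGES`, `COARSE`, `SHIFT`, `contains`) and the toy data are hypothetical, not the actual generated tables in the crate: a coarse table indexed by the high bits of a code point yields a small window into the full sorted range list, and the binary search then runs only inside that window.

```rust
// (start, end) inclusive code-point ranges, sorted by start.
// Stands in for the full ~1000-entry generated table.
const RANGES: &[(u32, u32)] = &[
    (0x0041, 0x005A), // A-Z
    (0x0061, 0x007A), // a-z
    (0x00C0, 0x00D6), // À-Ö
    (0x00D8, 0x00F6), // Ø-ö
];

// Pre-lookup table indexed by `code >> SHIFT`: each entry is the
// index window into RANGES that could contain the code point.
// A real generator would emit this; here it is hand-written.
const SHIFT: u32 = 7; // 128-code-point buckets
const COARSE: &[(usize, usize)] = &[
    (0, 2), // 0x0000..0x0080 -> RANGES[0..2]
    (2, 4), // 0x0080..0x0100 -> RANGES[2..4]
];

fn contains(code: u32) -> bool {
    let bucket = (code >> SHIFT) as usize;
    let (lo, hi) = match COARSE.get(bucket) {
        Some(&window) => window,
        None => return false, // code point above the table
    };
    // Binary search only within the narrowed window, so on average
    // far fewer memory loads than searching the whole list.
    RANGES[lo..hi]
        .binary_search_by(|&(start, end)| {
            if code < start {
                std::cmp::Ordering::Greater
            } else if code > end {
                std::cmp::Ordering::Less
            } else {
                std::cmp::Ordering::Equal
            }
        })
        .is_ok()
}

fn main() {
    assert!(contains('Z' as u32));
    assert!(contains(0x00E9)); // é, inside 0x00D8..=0x00F6
    assert!(!contains(' ' as u32));
    println!("ok");
}
```

With a well-chosen bucket size, most windows contain zero or one range, so the binary search degenerates to one or two comparisons, matching the load counts quoted above.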