llogiq / bytecount

Counting occurrences of a given byte or UTF-8 characters in a slice of memory – fast
Apache License 2.0
225 stars 26 forks

improve num_chars perf #30

Closed llogiq closed 6 years ago

llogiq commented 6 years ago

This is a very small improvement, but nonetheless I think it's worth it. The improvement comes in two steps:

  1. inverting the count logic so we count character-starting bytes up from zero and no longer need to subtract from the slice length
  2. for the non-simd case, using simple shifts and AND/OR operations instead of the rather complex equality check
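To illustrate the two steps, here is a minimal sketch and not the code from this PR. It assumes that counting characters means counting every byte that is not a UTF-8 continuation byte (bit pattern `10xxxxxx`), and it assumes a 64-bit word for the non-simd path. All function names and the `LO` constant are hypothetical and chosen only for this example.

```rust
/// Subtraction style: count continuation bytes (10xxxxxx) and subtract
/// that from the slice length.
fn num_chars_by_subtraction(bytes: &[u8]) -> usize {
    bytes.len() - bytes.iter().filter(|&&b| (b & 0xC0) == 0x80).count()
}

/// Inverted style: count the bytes that are NOT continuation bytes,
/// starting from zero, so no subtraction is needed.
fn num_chars_naive(bytes: &[u8]) -> usize {
    bytes.iter().filter(|&&b| (b & 0xC0) != 0x80).count()
}

/// A 1 in the lowest bit of each of the 8 byte lanes of a u64.
const LO: u64 = 0x0101_0101_0101_0101;

/// For every byte lane in `x`, leaves a 1 in the lane's lowest bit if the
/// byte is not a continuation byte, i.e. if bit 7 is clear or bit 6 is set.
/// Only shifts, NOT, OR and AND are used; no per-lane equality check.
fn leading_byte_mask(x: u64) -> u64 {
    ((!x >> 7) | (x >> 6)) & LO
}

/// Word-at-a-time count built on `leading_byte_mask`.
fn num_chars_swar(bytes: &[u8]) -> usize {
    let mut chunks = bytes.chunks_exact(8);
    let mut count = 0usize;
    for chunk in &mut chunks {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        let word = u64::from_le_bytes(buf);
        // Each lane contributes at most one set bit, so a popcount sums them.
        count += leading_byte_mask(word).count_ones() as usize;
    }
    // Fewer than 8 bytes are left over; handle them one by one.
    count + num_chars_naive(chunks.remainder())
}
```

For valid UTF-8 input, all three functions should agree with `str::chars().count()`. Using `count_ones` to sum the lanes is a simplification made for this sketch; a real implementation may accumulate the per-lane bits differently.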

In my benchmarks I see a consistent improvement in the non-simd case for all char counts we win against naive, and a minor improvement in some cases for simd and avx. Even if it amounts to almost nothing, it will likely reduce the code size by a few bytes, which may help with inlining and caching.

llogiq commented 6 years ago

Next up we can look into using specialized instructions for simd/avx.