This is based on top of #144, so the first two commits are from there. I'll rebase once it gets merged.
I ran some performance tests with random data from /dev/random to help decide whether we should use the vector implementation for all lengths (#29). It turns out the generic implementation is indeed faster up to length 31 (first plot), because for those lengths the vector implementation relies on successive table lookups to calculate the checksum, with no vectors involved.
So I decided to get the best of both worlds and only apply the vector version for lengths larger than 32:
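
A length-based dispatch of this kind could look roughly like the sketch below. Everything in it is an assumption for illustration, not the code from this PR: the names `checksum_generic`, `checksum_vector` and `VECTOR_THRESHOLD` are hypothetical, and a plain additive checksum stands in for the real one so that the two paths can be checked against each other.

```python
# Hypothetical threshold, taken from the description above: the vector path
# is only used for inputs longer than 32 bytes.
VECTOR_THRESHOLD = 32

# Hypothetical block width, standing in for a SIMD register width.
BLOCK_SIZE = 16


def checksum_generic(data: bytes) -> int:
    """Stand-in for the generic path: process one byte at a time."""
    acc = 0
    for byte in data:
        acc = (acc + byte) & 0xFFFFFFFF
    return acc


def checksum_vector(data: bytes) -> int:
    """Stand-in for the vector path: process whole blocks, then the tail."""
    acc = 0
    full = len(data) - len(data) % BLOCK_SIZE
    # Whole blocks (a real implementation would use vector instructions here).
    for start in range(0, full, BLOCK_SIZE):
        acc = (acc + sum(data[start:start + BLOCK_SIZE])) & 0xFFFFFFFF
    # Remaining bytes that do not fill a block.
    for byte in data[full:]:
        acc = (acc + byte) & 0xFFFFFFFF
    return acc


def checksum(data: bytes) -> int:
    """Pick the implementation based on input length."""
    if len(data) > VECTOR_THRESHOLD:
        return checksum_vector(data)
    return checksum_generic(data)
```

The point of the sketch is only the shape of `checksum`: both paths must return the same value for every input, so the threshold affects speed and never the result.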