libnxz / power-gzip

POWER NX zlib compliant library

Replace assembly-based crc32 with builtin-based implementation #146

Closed mscastanho closed 2 years ago

mscastanho commented 2 years ago

This is based on top of #144, so the first two commits are from there. I'll rebase once it gets merged.

I ran some performance tests with random data from /dev/random to help decide whether we should use the vector implementation for all lengths (#29). It turns out the generic implementation is indeed faster up to length 31 (first plot), because for those lengths the vector implementation falls back to successive table lookups to calculate the checksum, with no vectors involved.

image

So I decided to get the best of both worlds and only apply the vector version for lengths larger than 32:

image
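
For readers who want to see the shape of that dispatch, here is a minimal sketch of the idea, not the code from this PR. The names `crc32_generic`, `crc32_vector`, `crc32_dispatch` and `VECTOR_THRESHOLD` are hypothetical, and the exact boundary (`>` versus `>=`) is an assumption taken from the wording above. The vector path is only a stand-in that reuses the generic routine, so the sketch builds on any machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cutoff, taken from the measurements described above. */
#define VECTOR_THRESHOLD 32

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), standing in for the
 * generic path. Uses the zlib convention: pass 0 as the initial crc. */
static uint32_t crc32_generic(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Placeholder for the vector path. A real build would use the
 * builtin-based vector code here; this stub just reuses the generic
 * routine so that the example stays portable. */
static uint32_t crc32_vector(uint32_t crc, const unsigned char *buf, size_t len)
{
    return crc32_generic(crc, buf, len);
}

/* Pick the implementation based on the input length. */
static uint32_t crc32_dispatch(uint32_t crc, const unsigned char *buf, size_t len)
{
    if (len > VECTOR_THRESHOLD)
        return crc32_vector(crc, buf, len);
    return crc32_generic(crc, buf, len);
}
```

Both paths must return identical checksums for any input, so the threshold only affects speed, never results. That makes it safe to tune the cutoff from benchmarks alone.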