I noticed that a few simple things could be done to improve hashing performance:
- CRC32 zero extension can be computed with polynomial arithmetic over GF(2) instead of feeding zero bytes through the CRC. This greatly speeds up window table generation.
- The CRC window mask can be folded into the window table itself, which saves the separate mask operation. The speed difference is likely negligible, but it simplifies the code.
- Avoid unnecessarily copying blocks in MD5 on little-endian CPUs.
- Use slightly faster MD5 round expressions.
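
To illustrate the first point, here is a minimal sketch of zero extension done by multiplication rather than by feeding zero bytes. It assumes the reflected CRC32 polynomial 0xEDB88320 and works on the raw CRC register (no initial or final inversion). All function names are made up for this example and are not the ones used in the patch or in par2cmdline.

```cpp
#include <cassert>
#include <cstdint>

using std::uint32_t;
using std::uint64_t;

// Reflected CRC32 polynomial. In this bit order, bit 31 is the x^0 term.
static const uint32_t POLY = 0xEDB88320u;

// Multiply two polynomials over GF(2), reduced modulo the CRC polynomial.
static uint32_t gf_mul(uint32_t a, uint32_t b) {
    uint32_t product = 0;
    for (uint32_t bit = 0x80000000u; bit != 0; bit >>= 1) {
        if (a & bit) product ^= b;            // add b * x^i for each term x^i in a
        b = (b >> 1) ^ ((b & 1) ? POLY : 0);  // b = b * x mod P
    }
    return product;
}

// Compute x^(8 * n_bytes) mod P by square-and-multiply.
static uint32_t x_pow_8n(uint64_t n_bytes) {
    uint32_t result = 0x80000000u;  // the polynomial 1
    uint32_t power = 0x00800000u;   // the polynomial x^8
    while (n_bytes != 0) {
        if (n_bytes & 1) result = gf_mul(result, power);
        power = gf_mul(power, power);
        n_bytes >>= 1;
    }
    return result;
}

// Fast path: advance the raw CRC state as if n_bytes zero bytes were appended.
static uint32_t crc_zero_extend(uint32_t state, uint64_t n_bytes) {
    return gf_mul(state, x_pow_8n(n_bytes));
}

// Slow reference: push one zero byte through the bitwise CRC.
static uint32_t crc_feed_zero_byte(uint32_t state) {
    for (int i = 0; i < 8; i++)
        state = (state >> 1) ^ ((state & 1) ? POLY : 0);
    return state;
}

// Usage example: both paths must agree for any state and length.
static void check_zero_extend(uint32_t state, uint64_t n_bytes) {
    uint32_t slow = state;
    for (uint64_t i = 0; i < n_bytes; i++) slow = crc_feed_zero_byte(slow);
    assert(crc_zero_extend(state, n_bytes) == slow);
}
```

The slow path costs work proportional to the window length for every table entry, while the fast path needs roughly one multiplication per bit of the length. That is why the gain shows up mainly in window table generation.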
(I don't really see par2cmdline being highly performance focused, but these are relatively simple tweaks which I thought would make sense to have)