JayDDee / cpuminer-opt

Optimized multi algo CPU miner

Skein midstate isn't faster. #239

Closed JayDDee closed 4 years ago

JayDDee commented 4 years ago

Skein and skein2 aren't any faster when prehashing to midstate.

JayDDee commented 4 years ago

In the original skein code the "update" phase would process a full buffer only when it needed to make room for more data. The last full block wouldn't be processed until the "close" phase.

Changes were made to process any full buffer in the update phase. These changes seemed to work but they broke some algos (#238) that didn't use midstate optimization.

This applies to AVX2 and AVX512.

JayDDee commented 4 years ago

Skein512 will be the first hash function to use a new approach. The current code is divided into 3 phases: init, update, and close, which are called in sequence. Update may be called repeatedly to add data in chunks.

The problem with this structure is that update will leave a full 64-byte buffer uncompressed for the close function to compress. For midstate prehash to be useful, all full blocks need to be compressed in the update phase, with the close phase compressing only the last partial block and the final empty block.
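To make the buffering difference concrete, here is a minimal sketch. It is not the cpuminer-opt or sph code: `toy_ctx`, `toy_compress`, `toy_update_lazy` and `toy_update_eager` are hypothetical names, and the compression step is a toy mixer standing in for Skein. It only shows when a full 64-byte buffer gets compressed under each policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 64  /* buffer size in bytes, as for Skein-512 */

typedef struct {
    uint64_t h;            /* toy chaining value, NOT real Skein state */
    uint8_t  buf[BLOCK];   /* pending input */
    size_t   ptr;          /* bytes currently held in buf */
    int      compressions; /* number of blocks compressed so far */
} toy_ctx;

static void toy_init(toy_ctx *c)
{
    memset(c, 0, sizeof *c);
    c->h = 0xCBF29CE484222325ULL;
}

/* Toy stand-in for the compression function; consumes one full buffer. */
static void toy_compress(toy_ctx *c)
{
    assert(c->ptr == BLOCK);
    for (size_t i = 0; i < BLOCK; i++)
        c->h = (c->h ^ c->buf[i]) * 0x100000001B3ULL;
    c->compressions++;
    c->ptr = 0;
}

/* Lazy policy: a full buffer is compressed only when more data needs the room. */
static void toy_update_lazy(toy_ctx *c, const uint8_t *data, size_t len)
{
    while (len > 0) {
        if (c->ptr == BLOCK)
            toy_compress(c);
        size_t n = BLOCK - c->ptr;
        if (n > len) n = len;
        memcpy(c->buf + c->ptr, data, n);
        c->ptr += n; data += n; len -= n;
    }
}

/* Eager policy: the buffer is compressed as soon as it fills. */
static void toy_update_eager(toy_ctx *c, const uint8_t *data, size_t len)
{
    while (len > 0) {
        size_t n = BLOCK - c->ptr;
        if (n > len) n = len;
        memcpy(c->buf + c->ptr, data, n);
        c->ptr += n; data += n; len -= n;
        if (c->ptr == BLOCK)
            toy_compress(c);
    }
}
```

Feeding the first 64 bytes of an 80-byte input to the lazy version leaves `compressions` at 0, so a saved "midstate" contains no work and every nonce still pays for two blocks. The eager version has already done one compression at that point, which is the part a prehash can reuse.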

This is changing to specialized functions tailored to actual mining requirements and to support midstate prehash optimization. Mining requirements are:

For 512-bit hash, data sizes of 64, 80 and 128 bytes. For 256-bit hash, data sizes of 32, 64, 80 and 128 bytes. In reality all sizes that are multiples of 16 bytes will work.

The new functions are full, prehash64 and final16.

The full function already exists and is being deployed gradually to all applicable algos. In practice it will not be used with 80-byte input because prehash is faster.

Prehash64 and final16 need to be written.
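A rough sketch of how the split could be used on an 80-byte block header, assuming the nonce sits in the last 4 bytes as in Bitcoin-style headers. The names `toy_full`, `toy_prehash64`, `toy_final16` and `toy_hash_nonce` echo the names above but are hypothetical, as are their signatures, and the mixer is a toy rather than Skein. The point is only that the first 64 bytes are hashed once and each nonce then costs just the 16-byte tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_IV 0xCBF29CE484222325ULL

typedef struct { uint64_t h; } toy_mid;  /* toy midstate, NOT real Skein state */

/* Toy sequential mixer standing in for the real compression rounds. */
static uint64_t toy_mix(uint64_t h, const uint8_t *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        h = (h ^ p[i]) * 0x100000001B3ULL;
    return h;
}

/* One-shot hash of the whole input (init + update + close in one call). */
static uint64_t toy_full(const uint8_t *data, size_t len)
{
    assert(len % 16 == 0);
    return toy_mix(TOY_IV, data, len);
}

/* Hash the first 64 bytes once per work unit and keep the midstate. */
static void toy_prehash64(toy_mid *m, const uint8_t *data)
{
    m->h = toy_mix(TOY_IV, data, 64);
}

/* Finish from the midstate using only the last 16 bytes. */
static uint64_t toy_final16(const toy_mid *m, const uint8_t *tail)
{
    return toy_mix(m->h, tail, 16);
}

/* Per-nonce work: write the nonce into tail bytes 12..15 (little-endian,
   i.e. header bytes 76..79) and finish from the saved midstate. */
static uint64_t toy_hash_nonce(const toy_mid *m, uint8_t tail[16], uint32_t nonce)
{
    for (int i = 0; i < 4; i++)
        tail[12 + i] = (uint8_t)(nonce >> (8 * i));
    return toy_final16(m, tail);
}
```

In a scan loop, `toy_prehash64` would be called once when new work arrives and `toy_hash_nonce` once per nonce. The result matches `toy_full` over the same 80 bytes, so the saving comes purely from not repeating the first block.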

Algos that can use skein midstate prehash are: skein, skein2, skunk and permuted chained algos like x16r. The new functions can be used on all CPU architectures from SSE2 to AVX512.

Initial testing of a prototype AVX2 implementation increased the hashrate of the skein algo by 44%.

When completed, the skein model will be propagated to other algos' hash functions with a block size of 64 bytes or less. At this time the known ones are: skein, jh, luffa, cube, hamsi, shabal and whirlpool.

The same naming scheme will be used for all algos, thereby removing the sph prefix from the names of non-vectored hash functions.

JayDDee commented 4 years ago

Skein midstate prehash for AVX2 has been tested with skein, skein2, skunk and the x16r group of algos. I even got a successful test on skunk (28 min share TTF!)

JayDDee commented 4 years ago

Skein appears to be the only prehash candidate that required special coding to make it work. The others seem to be working so there's no need for custom functions for them.

Skein prehash improved skein 40% and skein2 30%. Will be in the next release.

JayDDee commented 4 years ago

cpuminer-opt-3.12.3 is released.