gsobala closed 3 years ago
Thanks. For me it seems to be about 5% slower.
The sparse multiplication code should now be faster. Please try again if you have time.
-DTRANSPOSE is now the default. To try without it, remove the #define TRANSPOSE line: https://github.com/syzygy1/Cfish/blob/421c10e9ab814746d2af3927122096347d29e47b/src/nnue.c#L102-L103 (Or just remove one character from TRANSPOSE, so the name no longer matches.)
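For anyone unsure why removing the line (or a single character of the name) is enough, here is a minimal sketch of how a compile-time switch like this typically works. It is not the actual Cfish code: `dot_dense`, `dot_sparse`, `dot` and `transpose_enabled` are hypothetical names, and the "sparse" path here just skips zero inputs for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the switch in nnue.c. Deleting this line, or
 * renaming it (e.g. to TRANSPOS), makes every #ifdef TRANSPOSE test fail. */
#define TRANSPOSE

/* Plain dot product over all n entries. */
int32_t dot_dense(const int8_t *w, const int8_t *x, size_t n)
{
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += w[i] * x[i];
    return sum;
}

/* Same result, but skips entries whose input is zero. */
int32_t dot_sparse(const int8_t *w, const int8_t *x, size_t n)
{
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (x[i] == 0)
            continue;
        sum += w[i] * x[i];
    }
    return sum;
}

/* The preprocessor picks exactly one path at compile time. */
int32_t dot(const int8_t *w, const int8_t *x, size_t n)
{
#ifdef TRANSPOSE
    return dot_sparse(w, x, n);
#else
    return dot_dense(w, x, n);
#endif
}

/* Reports which path was compiled in: 1 if TRANSPOSE is defined, else 0. */
int transpose_enabled(void)
{
#ifdef TRANSPOSE
    return 1;
#else
    return 0;
#endif
}
```

Both paths compute the same value, so toggling the macro should only change speed, which makes it easy to benchmark one build against the other.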
Just some feedback: the new NEON sparse multiplication code is about 10% slower when enabled with -DTRANSPOSE on a Raspberry Pi (64-bit ARMv8 build).