RMCQAZ closed 10 months ago
Nice work. Could you please also provide the output of the bench command for both the bmi2 binary and the avx512f binary?
If you don't mind, you can also add your name to the AUTHOR file, in alphabetical order.
Cmd: bench 4096 64 10000 current movetime
AVX512F: Total time (ms) : 10004 Nodes searched : 103364416 Nodes/second : 10332308
BMI2: Total time (ms) : 10002 Nodes searched : 99180949 Nodes/second : 9916111
They give the same result: bestmove h2e2 ponder h9g7
Compiler: Debian Clang 14.0.6
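The Nodes/second figures above are consistent with nodes searched times 1000, divided by the elapsed milliseconds, using integer division. A minimal C++ sketch of that arithmetic follows; the function names are illustrative and are not taken from the engine's source.

```cpp
#include <cstdint>

// Nodes per second via integer division, which reproduces the figures
// quoted above. Hypothetical helper, not the engine's own code.
std::uint64_t nodes_per_second(std::uint64_t nodes, std::uint64_t elapsed_ms) {
    return elapsed_ms == 0 ? 0 : nodes * 1000 / elapsed_ms;
}

// Throughput of run A relative to run B (1.0 means equal speed).
double speed_ratio(std::uint64_t nps_a, std::uint64_t nps_b) {
    return static_cast<double>(nps_a) / static_cast<double>(nps_b);
}
```

With the numbers above, nodes_per_second(103364416, 10004) gives 10332308 and nodes_per_second(99180949, 10002) gives 9916111, a ratio of roughly 1.04 in favor of the AVX512F binary. This is a single run, so the ratio is only indicative.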
That's good. You also need to provide the output of the default bench command using one thread, so we can confirm the implementation is correct.
Cmd: bench
AVX512F: Total time (ms) : 11864 Nodes searched : 1553955 Nodes/second : 130980
BMI2: Total time (ms) : 11937 Nodes searched : 1553955 Nodes/second : 130179
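Both binaries report the same node count (1553955) in the default bench, which is the check being asked for: a single-threaded search is deterministic, so equal node counts suggest the two code paths evaluate positions identically. A rough C++ sketch of automating that comparison, assuming the output keeps the "Nodes searched : N" layout quoted above (the helper names are made up for illustration):

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Extract the number that follows "Nodes searched :" in one line of bench output.
// Returns false if the key or a number after it cannot be found.
bool parse_nodes_searched(const std::string& line, std::uint64_t& nodes) {
    const std::string key = "Nodes searched";
    const std::size_t key_pos = line.find(key);
    if (key_pos == std::string::npos) return false;
    const std::size_t colon = line.find(':', key_pos + key.size());
    if (colon == std::string::npos) return false;
    try {
        // stoull skips leading spaces and stops at the first non-digit.
        nodes = std::stoull(line.substr(colon + 1));
    } catch (const std::exception&) {
        return false;
    }
    return true;
}

// True when both lines parse and report the same node count.
bool same_signature(const std::string& a, const std::string& b) {
    std::uint64_t na = 0, nb = 0;
    return parse_nodes_searched(a, na) && parse_nodes_searched(b, nb) && na == nb;
}
```

Applied to the two lines above, same_signature returns true. Applied to the earlier "bench 4096 64 10000 current movetime" lines it returns false, which is expected there because those runs are time-limited rather than fixed-depth.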
Thank you for providing the information. I'll merge this now.
It looks like this one has been broken since https://github.com/official-pikafish/Pikafish/commit/9b7b750718c85605f486e8bfd185b89d36e9b61a. Any thoughts?
Fixed it by correcting the permutation order for AVX512F.
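The thread does not show the actual fix, so the following is only a general illustration of why permutation order matters in wide SIMD code. Pack-style instructions on 512-bit registers work within each 128-bit lane, so packing two inputs yields their 64-bit chunks interleaved, and a follow-up permute has to restore the linear order. The scalar C++ model below treats a register as eight 64-bit chunks; the index list {0, 2, 4, 6, 1, 3, 5, 7} is the standard one for this interleaving, and all names are hypothetical.

```cpp
#include <array>
#include <cstddef>

using Half   = std::array<int, 4>;          // four packed chunks from one input
using Chunks = std::array<int, 8>;          // eight 64-bit chunks of one register
using Index  = std::array<std::size_t, 8>;  // permutation indices

// Model of a lane-wise pack: lane i of the result holds a[i] then b[i].
Chunks lanewise_pack(const Half& a, const Half& b) {
    Chunks out{};
    for (std::size_t lane = 0; lane < 4; ++lane) {
        out[2 * lane]     = a[lane];
        out[2 * lane + 1] = b[lane];
    }
    return out;
}

// Model of a chunk permute: dst[i] = src[idx[i]].
Chunks permute(const Chunks& src, const Index& idx) {
    Chunks dst{};
    for (std::size_t i = 0; i < dst.size(); ++i) dst[i] = src[idx[i]];
    return dst;
}

// Pack chunks 0..3 and 4..7, then restore linear order with the given indices.
Chunks pack_and_fix(const Index& idx) {
    const Half a = {{0, 1, 2, 3}};
    const Half b = {{4, 5, 6, 7}};
    return permute(lanewise_pack(a, b), idx);
}
```

Calling pack_and_fix with {0, 2, 4, 6, 1, 3, 5, 7} returns the chunks in order 0 through 7. Any other index list leaves the data scrambled, which is the kind of bug that changes the bench node count.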
CPUs that support AVX512F but not AVX512BW can still use partial AVX512 acceleration. Tested on a Phi-7230 CPU, where it is about 10% faster than AVX2.
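One way to picture the tiering described here is a selection function that prefers full AVX512 when both AVX512F and AVX512BW are present, and falls back to an AVX512F-only path before AVX2. The sketch below is an assumption about how such a choice could be expressed, not the project's build logic. The `__AVX2__`, `__AVX512F__`, and `__AVX512BW__` macros are the ones GCC and Clang predefine when the matching target flags are enabled; the enum and function names are hypothetical.

```cpp
// Hypothetical tiers, ordered from least to most capable.
enum class SimdPath { Scalar, Avx2, Avx512FOnly, Avx512Full };

// Pick the best tier for a given set of CPU features.
SimdPath select_path(bool avx2, bool avx512f, bool avx512bw) {
    if (avx512f && avx512bw) return SimdPath::Avx512Full;
    if (avx512f)             return SimdPath::Avx512FOnly;  // e.g. F without BW
    if (avx2)                return SimdPath::Avx2;
    return SimdPath::Scalar;
}

// Tier implied by the flags this translation unit was compiled with.
SimdPath compiled_path() {
#if defined(__AVX2__)
    const bool avx2 = true;
#else
    const bool avx2 = false;
#endif
#if defined(__AVX512F__)
    const bool avx512f = true;
#else
    const bool avx512f = false;
#endif
#if defined(__AVX512BW__)
    const bool avx512bw = true;
#else
    const bool avx512bw = false;
#endif
    return select_path(avx2, avx512f, avx512bw);
}
```

A CPU like the one mentioned above would correspond to select_path(true, true, false), which lands on the AVX512F-only tier instead of dropping all the way to AVX2.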