OpenMathLib / OpenBLAS

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
http://www.openblas.net
BSD 3-Clause "New" or "Revised" License

Replace the default sched_yield-based YIELDING with "nop" for RISCV64 #4990

Closed CheryDan closed 1 week ago

CheryDan commented 1 week ago

Regarding #4985: when testing the nop YIELDING with bench_symm on SG2042 under perf, cache misses and branch misses improved markedly (cache-misses fell to about 44% and branch-misses to about 7% of the sched_yield values).

| perf event | sched_yield | nop | ratio (nop / sched_yield) |
|---|---|---|---|
| duration_time | 858,189,070 | 846,633,123 | 0.986534498 |
| task-clock | 50,732.36 | 50,121.99 | 0.987968823 |
| cycles | 101,461,057,914 | 100,240,372,989 | 0.987968932 |
| instructions | 68,463,403,439 | 62,824,853,943 | 0.917641408 |
| cache-references | 1,180,192,443 | 515,655,574 | 0.436924992 |
| cache-misses | 1,180,195,818 | 515,658,928 | 0.436926585 |
| branches | 6,795,994,978 | 6,675,145,698 | 0.982217574 |
| branch-misses | 67,157,179 | 4,502,002 | 0.067036794 |
| L1-dcache-loads | 19,105,443,229 | 13,524,142,477 | 0.707868554 |
| L1-dcache-load-misses | 55,661,342 | 9,939,946 | 0.178578986 |
| LLC-load-misses | 13,924,594 | 13,411,098 | 0.96312309 |
| LLC-loads | 1,180,208,048 | 515,670,980 | 0.436932269 |
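For readers unfamiliar with the change, here is a minimal sketch of what swapping a sched_yield-based YIELDING for a "nop" looks like in a spin-wait loop. Only the macro name YIELDING comes from this PR; the `USE_NOP_YIELD` switch and the `spin_until_set` helper are hypothetical names made up for illustration, and the actual definition in OpenBLAS may differ (for example in how many nops it issues).

```c
#include <assert.h>
#include <sched.h>      /* sched_yield() */
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical switch for this sketch only; not an OpenBLAS build option. */
#ifndef USE_NOP_YIELD
#define USE_NOP_YIELD 1
#endif

#if USE_NOP_YIELD
/* Stay in user space: a single no-op instruction per spin (GCC/Clang syntax). */
#define YIELDING __asm__ __volatile__("nop")
#else
/* Default style: a system call that hands the CPU back to the scheduler. */
#define YIELDING sched_yield()
#endif

/* Hypothetical helper: spin until *flag becomes nonzero or max_spins is
 * reached, and return how many times YIELDING was executed. */
static long spin_until_set(atomic_int *flag, long max_spins)
{
    assert(flag != NULL);
    long spins = 0;
    while (!atomic_load_explicit(flag, memory_order_acquire) &&
           spins < max_spins) {
        YIELDING;
        spins++;
    }
    return spins;
}
```

Building the same loop with `-DUSE_NOP_YIELD=0` gives the sched_yield variant, which enters the kernel on every spin. That may be one source of the extra instructions, cache references, and branch misses in the sched_yield column above, though the PR itself does not state the cause.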

martin-frbg commented 1 week ago

thank you