Mysticial / y-cruncher

Bug-Tracking and open-sourced parts of y-cruncher.

Poor and inconsistent parallel performance scaling on Linux. #5

Open Mysticial opened 6 years ago

Mysticial commented 6 years ago

Parallel performance scaling is inconsistent on Linux. It does fairly well on 8-core Ryzen, but terribly on 10-core Skylake X.

While unconfirmed, it seems possible that a non-power-of-two core count doesn't play well with Linux.

y-cruncher is known to have load-balancing issues when the number of cores is not a power of two. But this isn't much of an issue on Windows, since the scheduler seems to do a fairly good job of time-slicing away the imbalances.
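To make the load-balancing point concrete, here is a minimal toy model. It is a sketch, not how y-cruncher actually schedules work: it assumes the work is split into a power-of-two number of equal-cost tasks (an assumption for illustration only), and the function names are hypothetical. It compares a static assignment with no time-slicing against an idealized scheduler that shares cores evenly across tasks.

```python
import math

def static_finish_time(num_tasks: int, num_cores: int) -> int:
    """Finish time (in task-units) when equal-cost tasks are pinned to cores.

    The busiest core holds ceil(num_tasks / num_cores) tasks and runs
    them back to back, so it determines when the whole job finishes.
    """
    return math.ceil(num_tasks / num_cores)

def time_sliced_finish_time(num_tasks: int, num_cores: int) -> float:
    """Finish time under an idealized scheduler that time-slices evenly.

    A single task cannot run on more than one core at once, so the
    finish time never drops below one task-unit.
    """
    return max(1.0, num_tasks / num_cores)

def static_utilization(num_tasks: int, num_cores: int) -> float:
    """Fraction of core-time doing useful work under static assignment."""
    return num_tasks / (num_cores * static_finish_time(num_tasks, num_cores))

if __name__ == "__main__":
    tasks = 16  # hypothetical power-of-two task count
    for cores in (8, 10):
        print(
            f"{cores} cores: static={static_finish_time(tasks, cores)} "
            f"time-sliced={time_sliced_finish_time(tasks, cores):.2f} "
            f"utilization={static_utilization(tasks, cores):.0%}"
        )
```

Under these assumptions, 16 tasks keep 8 cores fully busy, but on 10 cores a static assignment finishes no sooner than on 8 and leaves the cores 80% utilized. Only a scheduler that time-slices the leftover tasks across the idle cores recovers the difference, which is consistent with the behavior described above but does not confirm it as the cause.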


Benchmarks: y-cruncher v0.7.4.9478

Single-threaded benchmarks are included as a baseline. The 1-2% difference between Windows and Linux for the single-threaded benchmarks is expected since different compilers were used. The important part is the parallel speedup.

Core i9 7900X (10 cores, 20 threads) @ 3.8 GHz AVX512, 3.0 GHz cache, 3800 MT/s memory

| OS | Framework | Seconds | CPU Utilization | Speedup |
|---|---|---|---|---|
| Windows 10 (14393) | Single Threaded | 295.773 | 5.00% | - |
| Windows 10 (14393) | Push Pool | 32.598 | 87.05% | 9.07x |
| Windows 10 (14393) | Cilk Plus | 36.874 | 72.68% | 8.02x |
| Ubuntu 17.04 | Single Threaded | 300.900 | 5.00% | - |
| Ubuntu 17.04 | Push Pool | 38.511 | 77.44% | 7.81x |
| Ubuntu 17.04 | Cilk Plus | 34.923 | 86.25% | 8.62x |

Ryzen 7 1800X (8 cores, 16 threads) @ 3.8 GHz, 2666 MT/s memory

| OS | Framework | Seconds | CPU Utilization | Speedup |
|---|---|---|---|---|
| Windows 10 (14393) | Single Threaded | 671.752 | 6.25% | - |
| Windows 10 (14393) | Push Pool | 85.321 | 93.64% | 7.87x |
| Windows 10 (14393) | Cilk Plus | 93.344 | 84.92% | 7.20x |
| Ubuntu 17.04 | Single Threaded | 677.497 | 6.24% | - |
| Ubuntu 17.04 | Push Pool | 87.291 | 92.26% | 7.76x |
| Ubuntu 17.04 | Cilk Plus | 89.820 | 94.68% | 7.54x |
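
For anyone wanting to check the numbers, the Speedup column above is simply the single-threaded time divided by the parallel time on the same OS. A short sketch that recomputes it from the Push Pool rows and also compares Linux against Windows directly (the helper names and the dictionary layout are illustrative, not part of y-cruncher):

```python
def speedup(single_seconds: float, parallel_seconds: float) -> float:
    """Parallel speedup relative to the single-threaded run on the same OS."""
    return single_seconds / parallel_seconds

def slowdown_percent(linux_seconds: float, windows_seconds: float) -> float:
    """How much longer the Linux run took than the Windows run, in percent."""
    return (linux_seconds / windows_seconds - 1.0) * 100.0

# Timings in seconds, copied from the Single Threaded and Push Pool rows.
push_pool = {
    "Core i9 7900X": {
        "Windows": {"single": 295.773, "parallel": 32.598},
        "Linux": {"single": 300.900, "parallel": 38.511},
    },
    "Ryzen 7 1800X": {
        "Windows": {"single": 671.752, "parallel": 85.321},
        "Linux": {"single": 677.497, "parallel": 87.291},
    },
}

if __name__ == "__main__":
    for cpu, by_os in push_pool.items():
        for os_name, t in by_os.items():
            s = speedup(t["single"], t["parallel"])
            print(f"{cpu} / {os_name}: {s:.2f}x")
        gap = slowdown_percent(
            by_os["Linux"]["parallel"], by_os["Windows"]["parallel"]
        )
        print(f"{cpu}: Linux Push Pool is {gap:.1f}% slower than Windows")
```

With Push Pool, the Linux run is roughly 18% slower than Windows on the 10-core 7900X but only about 2% slower on the 8-core 1800X, which is the gap this issue is about.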