Open joshring opened 7 years ago
The benchmark computes a Fibonacci (FIB) sequence, and every thread repeats the same computation. Duplicating work across many cores will never yield a speedup, even at 100% efficiency: the best case is that N threads finish in the same time as one.
To see a speedup, the work has to be divided. I propose a simple, valid benchmark:
Multiply together the elements of a large list: divide the list into chunks and give one chunk to each thread.
Alternatively, use a loop to multiply simple numbers many times (>10^9 iterations) and divide the loop iterations among the threads.
The good news is that if the duplicated-work benchmark already runs at a similar speed to the single-threaded code, then dividing the work will already be faster.
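To make the list-chunking proposal concrete, here is a minimal sketch. It is written in Rust purely for illustration, since the issue does not name a language. The names `chunk_product` and `parallel_product` and the `n_threads` parameter are hypothetical, and wrapping multiplication is an assumption made so that the product cannot overflow and the chunked result matches the serial one exactly.

```rust
use std::thread;

/// Multiply all elements of `chunk` together, wrapping on overflow.
fn chunk_product(chunk: &[u64]) -> u64 {
    chunk.iter().fold(1u64, |acc, &x| acc.wrapping_mul(x))
}

/// Split `data` into roughly equal chunks, hand one chunk to each thread,
/// then combine the partial products. `n_threads` is a hypothetical knob.
fn parallel_product(data: &[u64], n_threads: usize) -> u64 {
    let n_threads = n_threads.max(1);
    // Ceiling division so every element lands in some chunk.
    let chunk_len = ((data.len() + n_threads - 1) / n_threads).max(1);
    thread::scope(|s| {
        // Spawn one worker per chunk; each does a distinct share of the work.
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk_product(chunk)))
            .collect();
        // Join the workers and multiply their partial results together.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .fold(1u64, |acc, p| acc.wrapping_mul(p))
    })
}

fn main() {
    // Small odd factors keep the wrapped product non-zero.
    let data: Vec<u64> = (0..1_000_000u64).map(|i| 2 * (i % 7) + 3).collect();
    let serial = chunk_product(&data);
    let parallel = parallel_product(&data, 4);
    assert_eq!(serial, parallel);
    println!("serial = {serial}, parallel = {parallel}");
}
```

Timing `chunk_product` against `parallel_product` over the same data would give the single-threaded versus divided-work comparison. The loop-iteration variant follows the same pattern, with each thread taking a sub-range of the iteration count instead of a slice of the list.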