giordano closed this 2 years ago
OK, now that performance is much better, I increased the default number of steps to 10^9 (to match the other compiled languages), and I finally get more reasonable results:
[cceamgi@login13 rust_pi_dir]$ for threads in 1 2 3 6 9 12 18 36; do OMP_NUM_THREADS=${threads} ./pi; done
Calculating PI using:
1000000000 slices
1 threads
Obtained value of PI: 3.1415926535899708
Time Elapsed: 1.859772146 seconds
Calculating PI using:
1000000000 slices
2 threads
Obtained value of PI: 3.141592653589901
Time Elapsed: 0.930759049 seconds
Calculating PI using:
1000000000 slices
3 threads
Obtained value of PI: 3.1415926535899623
Time Elapsed: 0.620628970 seconds
Calculating PI using:
1000000000 slices
6 threads
Obtained value of PI: 3.141592653589683
Time Elapsed: 0.310554742 seconds
Calculating PI using:
1000000000 slices
9 threads
Obtained value of PI: 3.141592653589656
Time Elapsed: 0.207469345 seconds
Calculating PI using:
1000000000 slices
12 threads
Obtained value of PI: 3.1415926535898593
Time Elapsed: 0.157268319 seconds
Calculating PI using:
1000000000 slices
18 threads
Obtained value of PI: 3.141592653589815
Time Elapsed: 0.105528349 seconds
Calculating PI using:
1000000000 slices
36 threads
Obtained value of PI: 3.1415926535898224
Time Elapsed: 0.59943666 seconds
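For readers without the code at hand, here is a minimal sketch of the kind of computation being timed above: a midpoint-rule estimate of pi, split across threads. It is not the code from this PR. The names `calc_pi` and `threads_from_env` are hypothetical, and reading the thread count from `OMP_NUM_THREADS` with a fallback of 1 is an assumption based only on the command line shown above.

```rust
use std::env;
use std::thread;

/// Midpoint-rule estimate of pi as the integral of 4 / (1 + x^2) over [0, 1].
/// The slices are divided into contiguous chunks, one chunk per thread.
/// (Hypothetical helper, not taken from the PR.)
fn calc_pi(num_steps: u64, num_threads: u64) -> f64 {
    let step = 1.0 / num_steps as f64;
    let handles: Vec<_> = (0..num_threads)
        .map(|t| {
            thread::spawn(move || {
                // Half-open range [start, end) of slices owned by thread t.
                let start = t * num_steps / num_threads;
                let end = (t + 1) * num_steps / num_threads;
                let mut sum = 0.0_f64;
                for i in start..end {
                    let x = (i as f64 + 0.5) * step;
                    sum += 4.0 / (1.0 + x * x);
                }
                sum
            })
        })
        .collect();
    // Combine the per-thread partial sums, then scale by the slice width.
    let total: f64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    total * step
}

/// Assumption: the thread count comes from OMP_NUM_THREADS, defaulting to 1
/// when the variable is unset, unparsable, or zero.
fn threads_from_env() -> u64 {
    env::var("OMP_NUM_THREADS")
        .ok()
        .and_then(|s| s.parse::<u64>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(1)
}
```

Calling `calc_pi(1_000_000_000, threads_from_env())` from `main` and timing it with `std::time::Instant` would give a run of the same shape as the one above. Because each thread adds its slices in a different order, the last few digits of the result can differ between thread counts, as they do in the output.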
I wonder if threading has some overhead that becomes noticeable with a small number of steps.
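One way to put a number on threading overhead is to compute speedup and parallel efficiency from the 10^9-slice timings printed above. The sketch below does that for the 1 to 18 thread runs; the 36-thread line is left out because its printed time has one fewer fractional digit than the others, so it is unclear how to read it. The function names are hypothetical, and the timings are copied from the output in this thread, not re-measured.

```rust
/// (threads, seconds) pairs copied from the 10^9-slice run in this thread.
const TIMINGS: [(u32, f64); 7] = [
    (1, 1.859772146),
    (2, 0.930759049),
    (3, 0.620628970),
    (6, 0.310554742),
    (9, 0.207469345),
    (12, 0.157268319),
    (18, 0.105528349),
];

/// Speedup relative to the single-thread time.
fn speedup(t1: f64, tn: f64) -> f64 {
    t1 / tn
}

/// Parallel efficiency: speedup divided by thread count (1.0 is ideal).
fn efficiency(t1: f64, tn: f64, threads: u32) -> f64 {
    speedup(t1, tn) / threads as f64
}

/// Returns (threads, speedup, efficiency) for every recorded run.
fn scaling_table() -> Vec<(u32, f64, f64)> {
    let t1 = TIMINGS[0].1;
    TIMINGS
        .iter()
        .map(|&(n, tn)| (n, speedup(t1, tn), efficiency(t1, tn, n)))
        .collect()
}

/// Prints the table in a simple fixed-width layout.
fn print_scaling_table() {
    println!("{:>7} {:>8} {:>10}", "threads", "speedup", "efficiency");
    for (n, s, e) in scaling_table() {
        println!("{:>7} {:>8.3} {:>10.3}", n, s, e);
    }
}
```

With these numbers the efficiency stays above 0.97 through 18 threads, so at 10^9 steps any fixed threading overhead is small next to the work per thread. Repeating the measurement with a much smaller step count would show whether that overhead starts to dominate.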
Benchmarks on Myriad. Before PR:
After PR:
I didn't look at the code, but I'm not sure this is using a single thread:
~0.18 seconds is close to the performance of 12 threads, and definitely way faster than 2 threads.