Closed giordano closed 2 years ago
I noticed that on Myriad the CPU FLoops example was ~10% slower than the other Julia threaded example:

    [cceamgi@login12 julia_floops_pi_dir]$ JULIA_NUM_THREADS=18 julia --project pi_floops.jl 100000000000
    Warming up...done. [0.549s]
    Calculating PI using:
      100000000000 slices
      18 thread(s)
    Obtained value of PI: 3.141592653589794
    Time taken: 5.472 seconds
    [cceamgi@login13 julia_threads_dir]$ JULIA_NUM_THREADS=18 ./run.sh 100000000000
    Warming up...done. [0.153s]
    Calculating PI using:
      100000000000 slices
      18 thread(s)
    Obtained value of PI: 3.1415926535897793
    Time taken: 4.871 seconds

but we can recover basically the same performance by tweaking the value of `basesize` of `ThreadedEx`. With this PR:

    [cceamgi@login12 julia_floops_pi_dir]$ JULIA_NUM_THREADS=18 julia --project pi_floops.jl 100000000000
    Warming up...done. [0.577s]
    Calculating PI using:
      100000000000 slices
      18 thread(s)
    Obtained value of PI: 3.1415926535897922
    Time taken: 4.882 seconds
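
For readers unfamiliar with the knob: `basesize` sets how many loop iterations each task processes sequentially, and it is passed to the executor in front of the `@floop` loop. Below is a minimal sketch of that pattern, not the actual contents of `pi_floops.jl`. The function name `pi_floops`, the midpoint-rule loop body, and the default of one chunk per thread are all assumptions made for illustration; the value chosen in this PR may differ.

```julia
using FLoops  # assumed to be available in the active project

# Hypothetical helper: estimate pi by midpoint-rule integration of 4/(1+x^2) on [0, 1].
# The default basesize (one chunk per thread) is an illustrative choice only.
function pi_floops(n::Int; basesize::Int = cld(n, Threads.nthreads()))
    h = 1.0 / n
    # basesize = number of iterations each task handles sequentially
    @floop ThreadedEx(basesize = basesize) for i in 1:n
        x = (i - 0.5) * h
        @reduce(s += 4.0 / (1.0 + x * x))
    end
    return s * h
end

println(pi_floops(1_000_000))
```

Larger chunks mean fewer tasks and less scheduling overhead; smaller chunks give the scheduler more room to balance load, so the best value depends on the machine and the loop body.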