With this small change (adding @simd in front of the for loop), I can match the performance of the Fortran example compiled with ifort on Myriad:
[cceamgi@login12 fortran_pi_dir]$ ./run.sh
rm -f *.o pi
make -f Makefile.intel
make[1]: Entering directory `/lustre/home/cceamgi/repo/pi_examples/fortran_pi_dir'
ifort -O2 -xHost -o pi pi.f90
make[1]: Leaving directory `/lustre/home/cceamgi/repo/pi_examples/fortran_pi_dir'
Calculating PI using:
1000000000 slices
1 process
Obtained value of PI: 3.1415926536
Time taken: 0.87264 seconds
[cceamgi@login12 julia_pi_dir]$ julia pi_serial.jl
Calculating PI using:
1000000000 slices
1 worker(s)
Obtained value of PI: 3.1415926535898455
Time taken: 0.8804278373718262 seconds
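For anyone who wants to see the shape of the change, here is a minimal sketch. It is not the actual pi_serial.jl: the function name estimate_pi and the midpoint-rule integration of 4/(1+x^2) are assumptions made for illustration. Only the @simd placement in front of the for loop comes from the change described above.

```julia
# Hypothetical sketch: estimate_pi and the midpoint rule are illustrative
# assumptions, not the contents of the real pi_serial.jl.
function estimate_pi(n::Int)
    h = 1.0 / n          # width of one slice
    s = 0.0              # running sum (the reduction variable)
    # @simd tells the compiler it may reorder the floating-point additions
    # in this reduction, which is what lets it vectorise the loop.
    @simd for i in 1:n
        x = (i - 0.5) * h            # midpoint of slice i
        s += 4.0 / (1.0 + x * x)
    end
    return s * h
end

n = 1_000_000_000
t0 = time()
p = estimate_pi(n)
elapsed = time() - t0    # the first call also includes JIT compilation time

println("Calculating PI using:")
println("  ", n, " slices")
println("Obtained value of PI: ", p)
println("Time taken: ", elapsed, " seconds")
```

Because @simd allows the additions to be reassociated, the last few digits of the result can differ from a strictly sequential sum. That is presumably the trade-off being accepted here in exchange for the speed-up.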