Modern Minpack benchmarks

certik commented 1 year ago

One can apply the following patch:

--- a/examples/example_hybrd.f90
+++ b/examples/example_hybrd.f90
@@ -10,7 +10,7 @@ program example_hybrd

     implicit none

-    integer,parameter :: n = 9
+    integer,parameter :: n = 523
     integer,parameter :: ldfjac = n
     integer,parameter :: lr = (n*(n+1))/2

Larger values of n fail on my computer due to https://github.com/lfortran/lfortran/issues/1624. It is best to use the upstream modern Minpack from https://github.com/fortran-lang/minpack.git; I am using commit c0b5aea9fcd2b83865af921a7a7e881904f8d3c2.
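To try several problem sizes without editing the file by hand, the same one-line change can be scripted. This is a minimal sketch assuming a POSIX shell with `sed`. `set_n` is a hypothetical helper name, and the pattern assumes the declaration is spelled exactly as in the patch above:

```shell
# set_n FILE N: rewrite the "integer,parameter :: n = <digits>" line in FILE.
# Hypothetical helper; it only matches the exact spelling used in the patch.
set_n() {
    file=$1
    n=$2
    tmp="$file.tmp"
    # Replace the digits after "n = " and leave every other line untouched.
    sed "s/\(integer,parameter :: n = \)[0-9][0-9]*/\1$n/" "$file" > "$tmp" \
        && mv "$tmp" "$file"
}
```

From the minpack checkout, `set_n examples/example_hybrd.f90 523` should then have the same effect as the patch.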

Results with GFortran 11.3.0 (from Spack) on Apple M1 Max:

$ gfortran -O3 -march=native -ffast-math src/minpack.f90 examples/example_hybrd.f90 -o hybrd.gf
$ time ./hybrd.gf
[...]
      -0.7070700D+00 -0.7070063D+00 -0.7068325D+00
      -0.7063577D+00 -0.7050615D+00 -0.7015252D+00
      -0.6918946D+00 -0.6657975D+00 -0.5960353D+00
      -0.4164123D+00
./hybrd.gf  0.09s user 0.00s system 96% cpu 0.091 total

And LFortran (latest master commit d27eff0987cafd590086d6a8ca7107b21e46820a):

$ lfortran -c src/minpack.f90 && lfortran --fast examples/example_hybrd.f90 -o hybrd.lf
$ time ./hybrd.lf
[...]
-7.07070007230867881e-01
-7.07006344582209456e-01
-7.06832481717657446e-01
-7.06357705467051677e-01
-7.05061525360883845e-01
-7.01525192607980519e-01
-6.91894627079055691e-01
-6.65797523770207955e-01
-5.96035314727141552e-01
-4.16412300836015881e-01

./hybrd.lf  0.16s user 0.00s system 98% cpu 0.167 total

The results agree to the seven digits that GFortran prints, and we are 1.8x slower. But we don't use "fast-math" (because I haven't figured out how to enable it in LLVM yet, patches welcome!), so we can also compare against GFortran without -ffast-math:

$ gfortran -O3 -march=native src/minpack.f90 examples/example_hybrd.f90 -o hybrd.gf2
$ time ./hybrd.gf2
      -0.7070700D+00 -0.7070063D+00 -0.7068325D+00
      -0.7063577D+00 -0.7050615D+00 -0.7015252D+00
      -0.6918946D+00 -0.6657975D+00 -0.5960353D+00
      -0.4164123D+00
./hybrd.gf2  0.15s user 0.00s system 97% cpu 0.156 total

Now we are only 7% slower.
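The slowdown figures quoted here are plain ratios of the `total` wall-clock times. Here is a minimal sketch of that arithmetic, assuming a POSIX shell with `awk`. `slowdown` is a hypothetical helper name, not part of either compiler:

```shell
# slowdown OURS THEIRS: print OURS/THEIRS as a ratio and as a percentage.
# Hypothetical helper for illustration; both arguments are wall-clock seconds.
slowdown() {
    awk -v a="$1" -v b="$2" \
        'BEGIN { printf "%.2fx, %.0f%% slower\n", a / b, (a / b - 1) * 100 }'
}

slowdown 0.167 0.091   # LFortran vs GFortran with -ffast-math
slowdown 0.167 0.156   # LFortran vs GFortran without -ffast-math
```

With the timings above this gives ratios of 1.84 and 1.07, matching the "1.8x" and "7%" figures.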

One can also try compiling to LLVM and then using Clang, but I am not sure if all optimizations are enabled correctly:

$ lfortran -c src/minpack.f90 && lfortran --fast examples/example_hybrd.f90 --show-llvm > x.ll
$ clang -O3 -ffast-math x.ll -L$HOME/repos/lfortran/lfortran/src/runtime/ -llfortran_runtime -Wl,-rpath -Wl,$HOME/repos/lfortran/lfortran/src/runtime/ -o hybrd.lf2
$ time ./hybrd.lf2
-7.07070007230867881e-01
-7.07006344582209456e-01
-7.06832481717657446e-01
-7.06357705467051677e-01
-7.05061525360883845e-01
-7.01525192607980519e-01
-6.91894627079055691e-01
-6.65797523770207955e-01
-5.96035314727141552e-01
-4.16412300836015881e-01

./hybrd.lf2  0.17s user 0.00s system 97% cpu 0.174 total

This is even slower than going through LFortran directly, which is why I doubt that all optimizations are being applied.

Smit-create commented 1 year ago

Results on my machine, Mac M1 2020 (8 GB):

GFortran with -ffast-math

% gfortran -O3 -march=native -ffast-math src/minpack.f90 examples/example_hybrd.f90 -o hybrd.gf
% time ./hybrd.gf
...
./hybrd.gf  0.12s user 0.01s system 19% cpu 0.641 total
% time ./hybrd.gf
...
./hybrd.gf  0.12s user 0.01s system 95% cpu 0.133 total

LFortran

% lfortran -c src/minpack.f90 && lfortran --fast examples/example_hybrd.f90 -o hybrd.lf
% time ./hybrd.lf
...
./hybrd.lf  0.22s user 0.00s system 29% cpu 0.766 total
% time ./hybrd.lf
...
./hybrd.lf  0.23s user 0.01s system 97% cpu 0.246 total

GFortran without -ffast-math

% gfortran -O3 -march=native src/minpack.f90 examples/example_hybrd.f90 -o hybrd.gf2
% time ./hybrd.gf2 
...
./hybrd.gf2  0.19s user 0.01s system 27% cpu 0.692 total
% time ./hybrd.gf2
...
./hybrd.gf2  0.18s user 0.01s system 98% cpu 0.191 total

| Compiler | Time (s) | Time relative to LFortran |
| --- | --- | --- |
| GFortran with `-ffast-math` | 0.133 | 0.54 |
| GFortran without `-ffast-math` | 0.191 | 0.78 |
| LFortran (fast) | 0.246 | 1.0 |

I'm not sure if the LFortran results are correct, because LFortran on my machine is built for macOS-x86_64 while the machine itself is macOS-arm64.

certik commented 1 year ago

@Smit-create yes, you are benchmarking Rosetta, which translates x86_64 to arm64. I think your machine should give similar results to mine. Thanks for double checking it. Btw, you should disable the online communication, which will speed up the first run by about 0.5s for you: https://github.com/lcompilers/lpython#speed-up-integration-tests-on-macos.
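Independently of that setting, the first-run overhead can be kept out of a comparison by taking the fastest of several runs. This is a small sketch, assuming the zsh-style `time` lines shown above are piped in or saved to a file. `best_total` is a hypothetical helper name:

```shell
# best_total: read zsh-style `time` lines on stdin, e.g.
#   ./hybrd.gf  0.12s user 0.01s system 95% cpu 0.133 total
# and print the smallest "total" value seen (nothing if no line matches).
best_total() {
    awk '$NF == "total" {
             t = $(NF - 1) + 0
             if (!seen || t < min) { min = t; seen = 1 }
         }
         END { if (seen) print min }'
}

printf '%s\n' \
    './hybrd.gf  0.12s user 0.01s system 19% cpu 0.641 total' \
    './hybrd.gf  0.12s user 0.01s system 95% cpu 0.133 total' | best_total
```

For the two GFortran runs above this selects 0.133, the value used in the table.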

Pranavchiku commented 1 year ago

Results on my machine, Ubuntu 22.04 (8GB):

GFortran with -ffast-math

$ gfortran -O3 -march=native -ffast-math src/minpack.f90 examples/example_hybrd.f90 -o hybrd.gf
$ time ./hybrd.gf 
....
real    0m0.196s
user    0m0.171s
sys 0m0.008s

LFortran

$ lfortran -c src/minpack.f90 && lfortran --fast examples/example_hybrd.f90 -o hybrd.lf
$ time ./hybrd.lf 
...
real    0m0.842s
user    0m0.834s
sys 0m0.008s

GFortran without -ffast-math


$ gfortran -O3 -march=native src/minpack.f90 examples/example_hybrd.f90 -o hybrd.gf2
$ time ./hybrd.gf2
real    0m0.474s
user    0m0.469s
sys 0m0.005s

Pranavchiku commented 1 year ago

LFortran, clang with -ffast-math

$ lfortran -c src/minpack.f90 && lfortran --fast examples/example_hybrd.f90 --show-llvm > x.ll
$ clang -O3 -march=native -ffast-math x.ll -L"/home/pranavchiku/lfortran/src/bin/../runtime" -Wl,-rpath,"/home/pranavchiku/lfortran/src/bin/../runtime" -llfortran_runtime -lm -o hybrd.lf2
$ time ./hybrd.lf2 
...
real    0m0.248s
user    0m0.244s
sys 0m0.004s

certik commented 1 year ago

@Pranavchiku do I understand your timings correctly, that by using Clang's LLVM optimizer, you are able to get 0.248s vs 0.196s for GFortran? That's about 26% slower, which is amazingly good for LFortran (as a starting point), if true.

Pranavchiku commented 1 year ago

Yes, correct. This is a great speed at this stage.

czgdp1807 commented 1 year ago

I think every machine is different. On mine (macOS Ventura 13.3.1, Apple M1, 8 GB) the largest n I can go to is 825.

LFortran commit - d27eff098

--- a/examples/example_hybrd.f90
+++ b/examples/example_hybrd.f90
@@ -10,7 +10,7 @@ program example_hybrd

     implicit none

-    integer,parameter :: n = 9
+    integer,parameter :: n = 825
     integer,parameter :: ldfjac = n
     integer,parameter :: lr = (n*(n+1))/2

LFortran

(lf) 21:46:12:~/lfortran_project/minpack % lfortran -c src/minpack.f90 && lfortran --fast examples/example_hybrd.f90 -o hybrd.lf
./hybrd.lf  0.64s user 0.01s system 99% cpu 0.646 total

GFortran

(arm-compilers) 21:49:19:~/lfortran_project/minpack % gfortran -O3 -march=native -ffast-math src/minpack.f90 examples/example_hybrd.f90 -o hybrd.gf
./hybrd.gf  0.35s user 0.01s system 99% cpu 0.361 total

(arm-compilers) 21:50:57:~/lfortran_project/minpack % gfortran -O3 -march=native src/minpack.f90 examples/example_hybrd.f90 -o hybrd.gf2
./hybrd.gf2  0.59s user 0.01s system 99% cpu 0.602 total

Clang

(lf) 21:52:23:~/lfortran_project/minpack % clang -O3 -ffast-math x.ll -L$HOME/lfortran_project/lfortran/src/runtime/ -llfortran_runtime -Wl,-rpath -Wl,$HOME/lfortran_project/lfortran/src/runtime/ -o hybrd.lf2
./hybrd.lf2  0.66s user 0.01s system 99% cpu 0.674 total

Versions

(lf) 21:55:31:~/lfortran_project/minpack % clang --version
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Target: arm64-apple-darwin22.4.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
(lf) 21:55:32:~/lfortran_project/minpack % conda activate arm-compilers
(arm-compilers) 21:55:39:~/lfortran_project/minpack % gfortran --version
GNU Fortran (GCC) 11.0.1 20210403 (experimental)
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Modern Minpack benchmarks #1625