spcl / npbench

NPBench - A Benchmarking Suite for High-Performance NumPy
BSD 3-Clause "New" or "Revised" License

`python run_benchmark.py -b cholesky2 -f dace_cpu` does not work on Macbook #25

Open pratyai opened 3 days ago

pratyai commented 3 days ago

I have been trying to run the benchmarks on my 2019 (Intel) MacBook, and this particular benchmark and framework combination fails to build on that machine. I get the following linker error (truncated):

[ 25%] Linking CXX shared library libfusion.dylib
Undefined symbols for architecture x86_64:
  "_LAPACKE_dpotrf", referenced from:
      __program_fusion_internal(fusion_state_t*, double*, long long) in fusion.cpp.o
ld: symbol(s) not found for architecture x86_64
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [libfusion.dylib] Error 1
make[1]: *** [CMakeFiles/fusion.dir/all] Error 2

On a Linux machine, on the other hand, everything works fine. I have not tried the newer (M1) Macs, so the problem may be present only on the older Intel MacBooks.


Side note: the dependency installation does not work exactly as stated in README.md for me. I now have the following changes, which pin versions and explicitly include additional dependencies:

diff --git a/requirements.txt b/requirements.txt
index 5c2e18f..83cbdf9 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,5 +1,9 @@
-matplotlib
-numpy
-pandas
-pygount
+matplotlib~=3.9.2
+numpy~=1.26.4
+pandas~=2.2.2
+pygount~=1.8.0
 scipy
+dace~=0.16.1
+numba~=0.60.0
+sympy~=1.13.2
+npbench~=0.1
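
To confirm that a virtual environment actually matches the pins in the diff above, a short check like the following may help. It is only a sketch: the `PINS` table and the `installed_versions` helper are illustrative names, not part of npbench, and the script reports versions without enforcing the `~=` compatibility rule.

```python
from importlib import metadata

# Pins copied from the diff above; scipy is unpinned there, so it is omitted.
PINS = {
    "matplotlib": "3.9.2",
    "numpy": "1.26.4",
    "pandas": "2.2.2",
    "pygount": "1.8.0",
    "dace": "0.16.1",
    "numba": "0.60.0",
    "sympy": "1.13.2",
    "npbench": "0.1",
}


def installed_versions(names):
    """Return {name: installed version string, or None if not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


# Print one line per package so mismatches are easy to spot by eye.
for name, version in installed_versions(PINS).items():
    print(f"{name}: pinned ~={PINS[name]}, installed {version or 'missing'}")
```

Running it inside the same virtual environment used for `run_benchmark.py` shows which of the listed packages are missing or at a different version.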

tbennun commented 2 days ago

This looks like it is related to macOS and DaCe. Have you tried the macOS instructions in https://spcldace.readthedocs.io/en/latest/setup/installation.html#common-issues-with-the-dace-python-module ?

pratyai commented 1 day ago

I am using ~/.dace.conf to 1) use the OpenBLAS library (installed through Homebrew), and 2) select either the g++ or the clang++ compiler (both installed through Homebrew). I am not sure whether anything else on the common-issues page is significant for macOS usage.

I show the errors for both compiler options below. Note that with GCC, all the benchmark programs break the same way, whereas with Clang, only a small number of benchmarks such as cholesky2 break.

Clang++

~ $ clang++ --version
clang version 18.1.8
Target: x86_64-apple-darwin24.0.0
Thread model: posix
InstalledDir: /opt/local/libexec/llvm-18/bin
~ $ cat ~/.dace.conf
compiler:
  cpu:
    executable: clang++
    args: -I/opt/local/include/openblas -L/opt/local/lib
(brandnewvenv) ~/g/npbench (main|✚2) $ rm -rf .dacecache
(brandnewvenv) ~/g/npbench (main|✚2) $ python run_benchmark.py -b cholesky2 -f dace_cpu 2>&1
***** Testing DaCe CPU with cholesky2 on the S dataset *****
NumPy - default - validation: 25ms
Failed to compile DaCe cpu fusion implementation.
Compiler failure:
[ 25%] Building CXX object CMakeFiles/fusion.dir/Users/pmz/gitspace/npbench/.dacecache/fusion/src/cpu/fusion.cpp.o
clang++: warning: argument unused during compilation: '-L/opt/local/lib' [-Wunused-command-line-argument]
[ 50%] Linking CXX shared library libfusion.dylib
Undefined symbols for architecture x86_64:
  "_LAPACKE_dpotrf", referenced from:
      __program_fusion_internal(fusion_state_t*, double*, long long) in fusion.cpp.o
ld: symbol(s) not found for architecture x86_64
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [libfusion.dylib] Error 1
make[1]: *** [CMakeFiles/fusion.dir/all] Error 2
make: *** [all] Error 2

Traceback (most recent call last):
  File "/Users/pmz/gitspace/npbench/brandnewvenv/lib/python3.12/site-packages/dace/codegen/compiler.py", line 232, in configure_and_compile
    _run_liveoutput("cmake --build . --config %s" % (Config.get('compiler', 'build_type')),
  File "/Users/pmz/gitspace/npbench/brandnewvenv/lib/python3.12/site-packages/dace/codegen/compiler.py", line 416, in _run_liveoutput
    raise subprocess.CalledProcessError(process.returncode, command, output.getvalue())
subprocess.CalledProcessError: Command 'cmake --build . --config RelWithDebInfo' returned non-zero exit status 2.

During handling of the above exception, another exception occurred:

< ...several more similar errors skipped... >
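
The Clang log above fails at link time on `_LAPACKE_dpotrf`, and the `-L/opt/local/lib` argument is reported as unused during compilation. One way to narrow this down is to check whether the OpenBLAS build in that directory exports the LAPACKE symbol at all. The sketch below uses Python's standard `ctypes` module; the library file name `libopenblas.dylib` is an assumption (only the directory appears in the config above), and `exports_symbol` is a hypothetical helper.

```python
import ctypes


def exports_symbol(lib_paths, symbol):
    """Map each library path to True/False (symbol found or not),
    or None if the library could not be loaded at all."""
    results = {}
    for path in lib_paths:
        try:
            lib = ctypes.CDLL(path)
        except OSError:
            # File missing, wrong architecture, or not a shared library.
            results[path] = None
            continue
        # ctypes resolves attributes through the dynamic loader, so
        # hasattr() is True only if the symbol is actually exported.
        results[path] = hasattr(lib, symbol)
    return results


# Assumed location: the -L directory from ~/.dace.conf plus a guessed file name.
candidates = ["/opt/local/lib/libopenblas.dylib"]
for path, found in exports_symbol(candidates, "LAPACKE_dpotrf").items():
    status = {True: "exports", False: "does NOT export", None: "could not load"}[found]
    print(f"{path}: {status} LAPACKE_dpotrf")
```

If the library exports the symbol, the failure is more likely in how the library is passed to the linker than in the OpenBLAS build itself. If it does not, a separate LAPACKE library may be needed.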

G++

~ $ g++-14 --version
g++-14 (Homebrew GCC 14.2.0) 14.2.0
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

~ $ cat ~/.dace.conf
compiler:
  cpu:
    executable: g++-14
    args: -I/opt/local/include/openblas -L/opt/local/lib
(brandnewvenv) ~/g/npbench (main|✚2) $ rm -rf .dacecache
(brandnewvenv) ~/g/npbench (main|✚2) [0|1] $ python run_benchmark.py -b cholesky2 -f dace_cpu 2>&1 | head -n 20
***** Testing DaCe CPU with cholesky2 on the S dataset *****
NumPy - default - validation: 25ms
Failed to compile DaCe cpu fusion implementation.
Compiler failure:
[ 25%] Building CXX object CMakeFiles/fusion.dir/Users/pmz/gitspace/npbench/.dacecache/fusion/src/cpu/fusion.cpp.o
In file included from /usr/local/Cellar/gcc/14.2.0/include/c++/14/cstdio:42,
                 from /Users/pmz/gitspace/npbench/brandnewvenv/lib/python3.12/site-packages/dace/codegen/../runtime/include/dace/dace.h:6,
                 from /Users/pmz/gitspace/npbench/.dacecache/fusion/src/cpu/fusion.cpp:2:
/usr/local/Cellar/gcc/14.2.0/lib/gcc/current/gcc/x86_64-apple-darwin23/14/include-fixed/stdio.h:83:8: error: 'FILE' does not name a type
   83 | extern FILE *__stdinp;
      |        ^~~~
/usr/local/Cellar/gcc/14.2.0/lib/gcc/current/gcc/x86_64-apple-darwin23/14/include-fixed/stdio.h:81:1: note: 'FILE' is defined in header '<cstdio>'; this is probably fixable by adding '#include <cstdio>'
   80 | #include <sys/_types/_seek_set.h>
  +++ |+#include <cstdio>
   81 |

< ...many, many more similar errors skipped... >
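
One detail in the two logs above may be worth checking: Clang reports the target `x86_64-apple-darwin24.0.0`, while the GCC `include-fixed` headers live under a `x86_64-apple-darwin23` directory. A mismatch between the OS version GCC was built for and the running system could plausibly produce header errors like the `FILE` one, but that is a hypothesis and not a confirmed diagnosis. The sketch below only compares the two version numbers taken from the logs; `darwin_major` is a hypothetical helper.

```python
import re


def darwin_major(text):
    """Extract the Darwin major version from a target triple or path, or None."""
    match = re.search(r"darwin(\d+)", text)
    return int(match.group(1)) if match else None


# Strings copied verbatim from the logs above.
clang_target = "x86_64-apple-darwin24.0.0"
gcc_header_path = (
    "/usr/local/Cellar/gcc/14.2.0/lib/gcc/current/gcc/"
    "x86_64-apple-darwin23/14/include-fixed/stdio.h"
)

host = darwin_major(clang_target)
built_for = darwin_major(gcc_header_path)
mismatch = host is not None and built_for is not None and host != built_for

print(f"host darwin{host}, GCC headers for darwin{built_for}, mismatch={mismatch}")
```

If the versions differ, reinstalling GCC so that its fixed headers match the current SDK is a commonly suggested remedy, though whether it resolves this case is untested here.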