Open daducci opened 2 years ago
It should be possible and it should be working.
Could try to run the function check_openmp()
defined in the setup.py
file here.
It returns 0 if linking against OpenMP works and 1 if not.
It is, but you need to compile with not the default mac toolchain since that does not work (I think they have a non standard implementation or something similar). I am personally using another module to check for it automatically here https://github.com/samuelstjean/spams-python/blob/master/setup.py#L114 Seems like it was also added to a special package to do just that if it's cleaner to use https://github.com/astropy/extension-helpers
Thanks @samuelstjean, I'll look into what you coded, we should reuse it.
extension-helpers
seems great, the github repository seems to be still active but there has not been any release on PyPI since december 2019, and that worries me a little.
I am personally using the older version, directly stolen from astropy, so just including this one file could be an idea. If stuff breaks we would have to fix it though, or copy an updated one, so it should not require too much work if it works as is and nobody touches it.
I asked them about releasing on PyPI more often (c.f. https://github.com/astropy/extension-helpers/issues/38).
Depending on their answer, we will either directly integrate their file as you did or just use extension-helpers
as a requirement (which would be less maintenance on our side if possible).
Should be fixed by #20
@daducci could you try with the corresponding version (c.f. below) and tell me if the mutli-threading works?
pip install git+https://github.com/getspams/spams-python.git@fix_openmp_req_macos
Sorry for the delay. I just tested this version, but it does not work, only 1 core is used. I found that I could fix the issue with OpenMP
on OSX if I change the following in the version on this repository:
setup.py:34 from
['cc', '-fopenmp' ,'-o',
to
['cc', '-Xpreprocessor', '-fopenmp', '-lomp', '-o',
setup.py:133 from
if check_openmp() == 0:
cc_flags.append('-fopenmp')
link_flags.append('-fopenmp')
to
if check_openmp() == 0:
cc_flags.append('-Xpreprocessor')
cc_flags.append('-fopenmp')
link_flags.append('-lomp')
@daducci Thanks for the report. I'll do some tests (and eventually report to https://github.com/astropy/extension-helpers if needed).
Seems like someone also found it https://github.com/astropy/extension-helpers/issues/40, my guess is that those flags are for gcc, and apple is using clang as the default compiler. For a very long time, their version had issues or something else, and people had to install a third party gcc to get stuff working (like by using homebrew for example).
Seems like the easy way here is to let extension helper do the job and put the correct flag for the correct compiler, or hack it in somehow, but that sounds like trouble in the future to make it work properly, and on old versions also at the same time
I am digging a bit on this. As long as extension-helpers
does not support this (no answer on the issue), it will be tricky on our part.
setuptools.command.build_ext.new_compiler
does not discriminate^[1] between gcc
and clang
(even less between Apple clang
and LLVM clang
I would guess). So we will have to do the check manually, and edit the flags accordingly (after calling extension-helpers
) for the moment.
[1] See
$ python setup.py build_ext --help-compiler
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -c test_openmp.c -o objects/test_openmp.o -fopenmp
x86_64-linux-gnu-gcc -pthread objects/test_openmp.o -o test_openmp -fopenmp
Compiling Cython/C/C++ extension with OpenMP support
List of available compilers:
--compiler=bcpp Borland C++ Compiler
--compiler=cygwin Cygwin port of GNU C Compiler for Win32
--compiler=mingw32 Mingw32 port of GNU C Compiler for Win32
--compiler=msvc Microsoft Visual C++
--compiler=unix standard UNIX-style compiler
I am trying to fix extension-helpers
regarding this matter, c.f. https://github.com/gdurif/extension-helpers/tree/support_apple_clang
I'll keep you updated (I will propose a PR when it is working).
Update: PR on the way https://github.com/astropy/extension-helpers/pull/42
@daducci could you tell me if installing the modified version of extension-helpers
before installing spams
solves the problem?
pip install git+https://github.com/gdurif/extension-helpers.git@support_apple_clang
pip install git+https://github.com/getspams/spams-python.git@fix_openmp_req_macos
No, in my case only 1 core is used. :-( Multiple cores are used only when installing the version with the manual patch I suggested few days ago. Dunno why this one does not work, as it seems all the modifications in there match mine.
@daducci ok thanks, I'll continue digging. Since the manual patch works, I hope we are not foo far from finding the solution!
@daducci I think the issue was found (https://github.com/astropy/extension-helpers/pull/42#issuecomment-1090845149). You are using the new Apple CPUs if I recall correctly ? (hence not Intel-based, hence a different location for libomp
)
Yes, I am! Sorry, I didn't think Apple would change paths/etc... based on the vendor of its CPUs!
No problem, we will find a solution, the person suggested one in his response.
@daducci Could you tell me the output of brew --prefix libomp
on your machine please?
Edit: and also python -c "import sys; print(sys.platform)"
? (just to be sure) Thanks
brew --prefix libomp
--> /usr/local/opt/libomp
python -c "import sys; print(sys.platform)"
--> darwin
@daducci could you retry ? (maybe uninstall every thing before or try in a clean environment)
pip install git+https://github.com/gdurif/extension-helpers.git@support_apple_clang
pip install git+https://github.com/getspams/spams-python.git@fix_openmp_req_macos
On my MacOS VM, multi-threading seems to work.
Hi @gdurif ,
sorry for the long delay, I have been off for Easter. Unfortunately, it doesn't work; I can make it work only when using the "manual trick" from a previous message. Here is the code snippet I use to benchmark:
import time
import spams
import numpy as np
from tqdm import tqdm
# data generation
np.random.seed(0)
X = np.asfortranarray(np.random.normal(size=(100,200)))
X = np.asfortranarray(X / np.tile(np.sqrt((X*X).sum(axis=0)),(X.shape[0],1)))
D = np.asfortranarray(np.random.normal(size=(100,1000)))
D = np.asfortranarray(D / np.tile(np.sqrt((D*D).sum(axis=0)),(D.shape[0],1)))
tic = time.time()
for i in tqdm(range(500)):
alpha = spams.lasso( X, D=D, lambda1=0.15, numThreads=-1 )
print( f"{time.time() - tic:.2f}s")
If needed, we can chat one of these days and make some test in real time. Just let me know if this can help!
Hi @daducci No problem ;-) Thanks for the code snippet. I will use it to improve my tests. And also we could definitely chat about it, I'll send you an email. Best
Hi guys, what about resuming this discussion to try getting to the bottom of it?
Looks like it will be complicated for now, since building on arm is done by cross-compiling with the thing I am using, and that means homebrew pulls stuff for x64 instead. It does build without putting in the libs, so people might be able to work around it by installing openblas themselves. If you want to try it (or even the old x64 version), that should be it https://we.tl/t-zeELwQpqmC And of course it could not be tested for the same reasons, so it might not even start.
Well if we wait a bit normally there will be some buildbots directly on arm64 for macs. As of now, this means it will be impossible to install a premade blas, unless we want to compile our own each time, but I would not do that because it means we would need to check the test and fix ourselves every little bugs since we build from source. Hopefully this will fix it https://github.com/pypa/cibuildwheel/issues/1204
As of now, I am using the conda version on windows, the cent os version on linux and the homebrew version for mac to avoid that, but that means we won't have a blas version for mac M1 until someone else makes it for me. With a bit of luck maybe the M1 version can use the libs from the x64 version as it apparently emulates stuff behind the scene for you, so if those builds work we could do that (and normally it should be multithreaded, but I don't have a mac so...).
Well I tried some new arm mac build, but I don't have access to one to check if everything works as supposed. They are built with some emulation layer, so I can not test them either on the build machines. It should be fine if you want to give it a try before putting them back here https://github.com/samuelstjean/spams-python/releases
Is it possible to use multiple threads/cores under OSX? It works on Linux, but I'm not able to use this feature on the Mac.