getkeops / keops

KErnel OPerationS, on CPUs and GPUs, with autodiff and without memory overflows
https://www.kernel-operations.io
MIT License

pykeops broken on Arch Linux and Colab with CUDA 11 #142

Closed scottgigante closed 3 years ago

scottgigante commented 3 years ago
❯ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0
>>> import pykeops
>>> pykeops.clean_pykeops()          # just in case old build files are still present
Cleaning /home/scottgigante/.cache/pykeops-1.4.2-cpython-38/...
    - /home/scottgigante/.cache/pykeops-1.4.2-cpython-38/build-libKeOpstorch91904fe69f has been removed.
>>> pykeops.test_torch_bindings()    # perform the compilation
Compiling libKeOpstorch91904fe69f in /home/scottgigante/.cache/pykeops-1.4.2-cpython-38:
       formula: Sum_Reduction(SqNorm2(x - y),1)
       aliases: x = Vi(0,3); y = Vj(1,3);
       dtype  : float32
... cmake: /usr/lib/libc.so.6: version `GLIBC_2.33' not found (required by cmake)
cmake: /usr/lib/libc.so.6: version `GLIBC_2.32' not found (required by /usr/lib/libuv.so.1)
cmake: /usr/lib/libc.so.6: version `GLIBC_2.33' not found (required by /usr/lib/libuv.so.1)

--------------------- CMAKE DEBUG -----------------
Command '['cmake', '/home/scottgigante/.local/lib/python3.8/site-packages/pykeops', "-DCMAKE_BUILD_TYPE='Release'", "-DFORMULA_OBJ='Sum_Reduction(SqNorm2(x - y),1)'", "-DVAR_ALIASES='auto x = Vi(0,3); auto y = Vj(1,3); '", "-Dshared_obj_name='libKeOpstorch91904fe69f'", "-D__TYPE__='float'", "-DPYTHON_LANG='torch'", "-DPYTHON_EXECUTABLE='/usr/bin/python'", "-DPYBIND11_PYTHON_VERSION='3.8'", '-DC_CONTIGUOUS=1', '-D__TYPEACC__=float', '-DSUM_SCHEME=1', '-DENABLECHUNK=1', '-DPYTORCH_ROOT_DIR=/home/scottgigante/.local/lib/python3.8/site-packages/torch', '-D_GLIBCXX_USE_CXX11_ABI=0', "-DcommandLine=cmake /home/scottgigante/.local/lib/python3.8/site-packages/pykeops -DCMAKE_BUILD_TYPE='Release' -DFORMULA_OBJ='Sum_Reduction(SqNorm2(x - y),1)' -DVAR_ALIASES='auto x = Vi(0,3); auto y = Vj(1,3); ' -Dshared_obj_name='libKeOpstorch91904fe69f' -D__TYPE__='float' -DPYTHON_LANG='torch' -DPYTHON_EXECUTABLE='/usr/bin/python' -DPYBIND11_PYTHON_VERSION='3.8' -DC_CONTIGUOUS=1 -D__TYPEACC__=float -DSUM_SCHEME=1 -DENABLECHUNK=1 -DPYTORCH_ROOT_DIR=/home/scottgigante/.local/lib/python3.8/site-packages/torch -D_GLIBCXX_USE_CXX11_ABI=0"]' returned non-zero exit status 1.

--------------------- ----------- -----------------
cmake: /usr/lib/libc.so.6: version `GLIBC_2.33' not found (required by cmake)
cmake: /usr/lib/libc.so.6: version `GLIBC_2.32' not found (required by /usr/lib/libuv.so.1)
cmake: /usr/lib/libc.so.6: version `GLIBC_2.33' not found (required by /usr/lib/libuv.so.1)

--------------------- MAKE DEBUG -----------------
Command '['cmake', '--build', '.', '--target', 'libKeOpstorch91904fe69f', '--', 'VERBOSE=1']' returned non-zero exit status 1.

--------------------- ----------- -----------------
Done.
Compiling libKeOpstorch91904fe69f in /home/scottgigante/.cache/pykeops-1.4.2-cpython-38:
       formula: Sum_Reduction(SqNorm2(x - y),1)
       aliases: x = Vi(0,3); y = Vj(1,3);
       dtype  : float32
... cmake: /usr/lib/libc.so.6: version `GLIBC_2.33' not found (required by cmake)
cmake: /usr/lib/libc.so.6: version `GLIBC_2.32' not found (required by /usr/lib/libuv.so.1)
cmake: /usr/lib/libc.so.6: version `GLIBC_2.33' not found (required by /usr/lib/libuv.so.1)

--------------------- CMAKE DEBUG -----------------
Command '['cmake', '/home/scottgigante/.local/lib/python3.8/site-packages/pykeops', "-DCMAKE_BUILD_TYPE='Debug'", "-DFORMULA_OBJ='Sum_Reduction(SqNorm2(x - y),1)'", "-DVAR_ALIASES='auto x = Vi(0,3); auto y = Vj(1,3); '", "-Dshared_obj_name='libKeOpstorch91904fe69f'", "-D__TYPE__='float'", "-DPYTHON_LANG='torch'", "-DPYTHON_EXECUTABLE='/usr/bin/python'", "-DPYBIND11_PYTHON_VERSION='3.8'", '-DC_CONTIGUOUS=1', '-D__TYPEACC__=float', '-DSUM_SCHEME=1', '-DENABLECHUNK=1', '-DPYTORCH_ROOT_DIR=/home/scottgigante/.local/lib/python3.8/site-packages/torch', '-D_GLIBCXX_USE_CXX11_ABI=0', "-DcommandLine=cmake /home/scottgigante/.local/lib/python3.8/site-packages/pykeops -DCMAKE_BUILD_TYPE='Debug' -DFORMULA_OBJ='Sum_Reduction(SqNorm2(x - y),1)' -DVAR_ALIASES='auto x = Vi(0,3); auto y = Vj(1,3); ' -Dshared_obj_name='libKeOpstorch91904fe69f' -D__TYPE__='float' -DPYTHON_LANG='torch' -DPYTHON_EXECUTABLE='/usr/bin/python' -DPYBIND11_PYTHON_VERSION='3.8' -DC_CONTIGUOUS=1 -D__TYPEACC__=float -DSUM_SCHEME=1 -DENABLECHUNK=1 -DPYTORCH_ROOT_DIR=/home/scottgigante/.local/lib/python3.8/site-packages/torch -D_GLIBCXX_USE_CXX11_ABI=0"]' returned non-zero exit status 1.

--------------------- ----------- -----------------
cmake: /usr/lib/libc.so.6: version `GLIBC_2.33' not found (required by cmake)
cmake: /usr/lib/libc.so.6: version `GLIBC_2.32' not found (required by /usr/lib/libuv.so.1)
cmake: /usr/lib/libc.so.6: version `GLIBC_2.33' not found (required by /usr/lib/libuv.so.1)

--------------------- MAKE DEBUG -----------------
Command '['cmake', '--build', '.', '--target', 'libKeOpstorch91904fe69f', '--', 'VERBOSE=1']' returned non-zero exit status 1.

--------------------- ----------- -----------------
Done.
Traceback (most recent call last):
  File "/home/scottgigante/.local/lib/python3.8/site-packages/pykeops/test/install.py", line 57, in test_torch_bindings
    if torch.allclose(my_conv(x, y).view(-1), torch.tensor(expected_res).type(torch.float32)):
  File "/home/scottgigante/.local/lib/python3.8/site-packages/pykeops/torch/generic/generic_red.py", line 421, in __call__
    out = GenredAutograd.apply(self.formula, self.aliases, backend, self.dtype,
  File "/home/scottgigante/.local/lib/python3.8/site-packages/pykeops/torch/generic/generic_red.py", line 29, in forward
    myconv = LoadKeOps(formula, aliases, dtype, 'torch', optional_flags+include_dirs).import_module()
  File "/home/scottgigante/.local/lib/python3.8/site-packages/pykeops/common/keops_io.py", line 50, in import_module
    return importlib.import_module(self.dll_name)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'libKeOpstorch91904fe69f'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/scottgigante/.local/lib/python3.8/site-packages/pykeops/test/install.py", line 68, in test_torch_bindings
    print(my_conv(x, y))
  File "/home/scottgigante/.local/lib/python3.8/site-packages/pykeops/torch/generic/generic_red.py", line 421, in __call__
    out = GenredAutograd.apply(self.formula, self.aliases, backend, self.dtype,
  File "/home/scottgigante/.local/lib/python3.8/site-packages/pykeops/torch/generic/generic_red.py", line 29, in forward
    myconv = LoadKeOps(formula, aliases, dtype, 'torch', optional_flags+include_dirs).import_module()
  File "/home/scottgigante/.local/lib/python3.8/site-packages/pykeops/common/keops_io.py", line 50, in import_module
    return importlib.import_module(self.dll_name)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'libKeOpstorch91904fe69f'
scottgigante commented 3 years ago

glibc version is 2.32:

#include <stdio.h>
#include <gnu/libc-version.h>
int main (void) { puts (gnu_get_libc_version ()); return 0; }

gives

2.32
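
For comparison with the installed version, the GLIBC versions that the loader complains about can be pulled out of the error text. This is a minimal sketch, not part of pykeops: the helper name `required_glibc_versions` and the `sample_log` variable are made up for illustration, and the sample text is copied from the cmake errors above.

```python
import re

# Matches the version tag in loader errors such as:
#   cmake: /usr/lib/libc.so.6: version `GLIBC_2.33' not found (required by cmake)
_GLIBC_TAG = re.compile(r"version `GLIBC_(\d+)\.(\d+)' not found")


def required_glibc_versions(log_text):
    """Return the sorted, de-duplicated (major, minor) versions named in the log."""
    found = {(int(major), int(minor)) for major, minor in _GLIBC_TAG.findall(log_text)}
    return sorted(found)


# Hypothetical variable holding the three error lines quoted in this report.
sample_log = """\
cmake: /usr/lib/libc.so.6: version `GLIBC_2.33' not found (required by cmake)
cmake: /usr/lib/libc.so.6: version `GLIBC_2.32' not found (required by /usr/lib/libuv.so.1)
cmake: /usr/lib/libc.so.6: version `GLIBC_2.33' not found (required by /usr/lib/libuv.so.1)
"""

if __name__ == "__main__":
    for major, minor in required_glibc_versions(sample_log):
        print(f"binary requires GLIBC_{major}.{minor}")
```

On Linux, `platform.libc_ver()` from the standard library reports the C library of the running Python interpreter, which can then be compared against this list.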
annavaughan commented 3 years ago

I'm having the same issue on Google Colab. This was working fine until yesterday.

!pip install pykeops
import pykeops
pykeops.clean_pykeops()
pykeops.test_torch_bindings() 

Run in a fresh Colab session, this gives:

Cleaning /root/.cache/pykeops-1.4.2-cpython-37/...
Done.
Compiling libKeOpstorch14d9f84160 in /root/.cache/pykeops-1.4.2-cpython-37:
       formula: Sum_Reduction(SqNorm2(x - y),1)
       aliases: x = Vi(0,3); y = Vj(1,3);
       dtype  : float32
...
-- The CUDA Host CXX Compiler: /usr/bin/c++
-- Compute properties automatically set to: -DMAXIDGPU=0;-DMAXTHREADSPERBLOCK0=1024;-DSHAREDMEMPERBLOCK0=49152
-- Autodetected CUDA architecture(s): 6.0
-- Using shared_obj_name: libKeOpstorch14d9f84160
-- First i variables detected is 0
-- First j variables detected is 1
-- Compiled formula is Sum_Reduction(SqNorm2(x - y),1); auto x = Vi(0,3); auto y = Vj(1,3); where the number of args is 2.
-- pybind11 v2.6.1
-- Configuring done
-- Generating done
-- Build files have been written to: /root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160

--------------------- MAKE DEBUG -----------------
Command '['cmake', '--build', '.', '--target', 'libKeOpstorch14d9f84160', '--', 'VERBOSE=1']' returned non-zero exit status 2.
/usr/local/lib/python2.7/dist-packages/cmake/data/bin/cmake -H/usr/local/lib/python3.7/dist-packages/pykeops -B/root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160 --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/make -f CMakeFiles/Makefile2 libKeOpstorch14d9f84160
make[1]: Entering directory '/root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160'
/usr/local/lib/python2.7/dist-packages/cmake/data/bin/cmake -H/usr/local/lib/python3.7/dist-packages/pykeops -B/root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160 --check-build-system CMakeFiles/Makefile.cmake 0
/usr/local/lib/python2.7/dist-packages/cmake/data/bin/cmake -E cmake_progress_start /root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160/CMakeFiles 5
/usr/bin/make -f CMakeFiles/Makefile2 CMakeFiles/libKeOpstorch14d9f84160.dir/all
make[2]: Entering directory '/root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160'
/usr/bin/make -f CMakeFiles/keopslibKeOpstorch14d9f84160.dir/build.make CMakeFiles/keopslibKeOpstorch14d9f84160.dir/depend
make[3]: Entering directory '/root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160'
cd /root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160 && /usr/local/lib/python2.7/dist-packages/cmake/data/bin/cmake -E cmake_depends "Unix Makefiles" /usr/local/lib/python3.7/dist-packages/pykeops /usr/local/lib/python3.7/dist-packages/pykeops /root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160 /root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160 /root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160/CMakeFiles/keopslibKeOpstorch14d9f84160.dir/DependInfo.cmake --color=
make[3]: Leaving directory '/root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160'
/usr/bin/make -f CMakeFiles/keopslibKeOpstorch14d9f84160.dir/build.make CMakeFiles/keopslibKeOpstorch14d9f84160.dir/build
make[3]: Entering directory '/root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160'
[ 20%] Building CUDA object CMakeFiles/keopslibKeOpstorch14d9f84160.dir/keops/core/link_autodiff.cu.o
/usr/local/cuda/bin/nvcc -ccbin=/usr/bin/c++ -DCUDA_BLOCK_SIZE=192 -DC_CONTIGUOUS=1 -DENABLECHUNK=1 -DKERNEL_GEOM_TYPE=0 -DKERNEL_SIG_TYPE=0 -DKERNEL_SPHERE_TYPE=0 -DMAXIDGPU=0 -DMAXTHREADSPERBLOCK0=1024 -DMODULE_NAME=libKeOpstorch14d9f84160 -DMODULE_NAME_FSHAPE_SCP=fshape_scp_gaussiangaussiangaussian_unoriented_float -DSHAREDMEMPERBLOCK0=49152 -DSUM_SCHEME=1 -DUSE_CUDA=1 -DUSE_DOUBLE=0 -DUSE_HALF=0 -D_FORCE_INLINES -D_GLIBCXX_USE_CXX11_ABI=0 -D__TYPEACC__=float -D__TYPE__=float -DkeopslibKeOpstorch14d9f84160_EXPORTS -I/usr/local/cuda/targets/x86_64-linux/include -I/usr/local/lib/python3.7/dist-packages/pykeops -I/usr/local/lib/python3.7/dist-packages/pykeops/keops -I/root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160 -I/usr/local/lib/python3.7/dist-packages/torch/include -I/usr/local/lib/python3.7/dist-packages/torch/include/torch/csrc/api/include -gencode arch=compute_60,code=sm_60 --use_fast_math --compiler-options=-fPIC -g -Xcompiler=-fPIC -include libKeOpstorch14d9f84160.h -std=c++14 -x cu -c /usr/local/lib/python3.7/dist-packages/pykeops/keops/core/link_autodiff.cu -o CMakeFiles/keopslibKeOpstorch14d9f84160.dir/keops/core/link_autodiff.cu.o
CMakeFiles/keopslibKeOpstorch14d9f84160.dir/build.make:62: recipe for target 'CMakeFiles/keopslibKeOpstorch14d9f84160.dir/keops/core/link_autodiff.cu.o' failed
make[3]: Leaving directory '/root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160'
CMakeFiles/Makefile2:331: recipe for target 'CMakeFiles/keopslibKeOpstorch14d9f84160.dir/all' failed
make[2]: Leaving directory '/root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160'
CMakeFiles/Makefile2:306: recipe for target 'CMakeFiles/libKeOpstorch14d9f84160.dir/rule' failed
make[1]: Leaving directory '/root/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch14d9f84160'
Makefile:196: recipe for target 'libKeOpstorch14d9f84160' failed


Done.


ModuleNotFoundError                       Traceback (most recent call last)

/usr/local/lib/python3.7/dist-packages/pykeops/test/install.py in test_torch_bindings()
     56 my_conv = pktorch.Genred(formula, var)
---> 57 if torch.allclose(my_conv(x, y).view(-1), torch.tensor(expected_res).type(torch.float32)):
     58 print("\npyKeOps with torch bindings is working!\n")

15 frames

/usr/local/lib/python3.7/dist-packages/pykeops/torch/generic/generic_red.py in __call__(self, backend, device_id, ranges, *args)
    421 out = GenredAutograd.apply(self.formula, self.aliases, backend, self.dtype,
--> 422 device_id, ranges, self.optional_flags, self.rec_multVar_highdim, *args)
    423

/usr/local/lib/python3.7/dist-packages/pykeops/torch/generic/generic_red.py in forward(ctx, formula, aliases, backend, dtype, device_id, ranges, optional_flags, rec_multVar_highdim, *args)
     28
---> 29 myconv = LoadKeOps(formula, aliases, dtype, 'torch', optional_flags+include_dirs).import_module()
     30

/usr/local/lib/python3.7/dist-packages/pykeops/common/keops_io.py in import_module(self)
     49 # os.path.dirname(importlib.util.find_spec(self.dll_name).origin)))
---> 50 return importlib.import_module(self.dll_name)

/usr/lib/python3.7/importlib/__init__.py in import_module(name, package)
    126 level += 1
--> 127 return _bootstrap._gcd_import(name[level:], package, level)
    128

/usr/lib/python3.7/importlib/_bootstrap.py in _gcd_import(name, package, level)

/usr/lib/python3.7/importlib/_bootstrap.py in _find_and_load(name, import_)

/usr/lib/python3.7/importlib/_bootstrap.py in _find_and_load_unlocked(name, import_)

ModuleNotFoundError: No module named 'libKeOpstorch14d9f84160'

During handling of the above exception, another exception occurred:

ModuleNotFoundError Traceback (most recent call last)

in ()
      1 import pykeops
      2 pykeops.clean_pykeops()
----> 3 pykeops.test_torch_bindings()

/usr/local/lib/python3.7/dist-packages/pykeops/test/install.py in test_torch_bindings()
     66 pykeops.config.build_type = "Debug"
     67 my_conv = pktorch.Genred(formula, var)
---> 68 print(my_conv(x, y))
     69

/usr/local/lib/python3.7/dist-packages/pykeops/torch/generic/generic_red.py in __call__(self, backend, device_id, ranges, *args)
    420
    421 out = GenredAutograd.apply(self.formula, self.aliases, backend, self.dtype,
--> 422 device_id, ranges, self.optional_flags, self.rec_multVar_highdim, *args)
    423
    424 if self.dtype in ('float16','half'):

/usr/local/lib/python3.7/dist-packages/pykeops/torch/generic/generic_red.py in forward(ctx, formula, aliases, backend, dtype, device_id, ranges, optional_flags, rec_multVar_highdim, *args)
     27 optional_flags += ['-DMULT_VAR_HIGHDIM=1']
     28
---> 29 myconv = LoadKeOps(formula, aliases, dtype, 'torch', optional_flags+include_dirs).import_module()
     30
     31 # Context variables: save everything to compute the gradient:

/usr/local/lib/python3.7/dist-packages/pykeops/common/keops_io.py in import_module(self)
     48 # pykeops.config.bin_folder, self.dll_name,
     49 # os.path.dirname(importlib.util.find_spec(self.dll_name).origin)))
---> 50 return importlib.import_module(self.dll_name)

/usr/lib/python3.7/importlib/__init__.py in import_module(name, package)
    125 break
    126 level += 1
--> 127 return _bootstrap._gcd_import(name[level:], package, level)
    128
    129

/usr/lib/python3.7/importlib/_bootstrap.py in _gcd_import(name, package, level)

/usr/lib/python3.7/importlib/_bootstrap.py in _find_and_load(name, import_)

/usr/lib/python3.7/importlib/_bootstrap.py in _find_and_load_unlocked(name, import_)

ModuleNotFoundError: No module named 'libKeOpstorch14d9f84160'
jeanfeydy commented 3 years ago

Hi @scottgigante, @annavaughan,

Thanks a lot for your (critical) bug reports! New CUDA releases tend to break a few things here and there... We try to keep the compilation engine as simple as possible to avoid these problems, but they still happen sometimes. After checking in Google Colab, it seems that the bug is also present in our "master" branch: we'll have to investigate a little bit.

Of course, we're going to work on a fix as soon as possible and ideally release new packages by the end of the week. In the short term, a quick fix to get back to work is simply to downgrade CUDA. On Google Colab,

!apt-get install cuda=10.2.89-1
!pip install pykeops
import pykeops
pykeops.clean_pykeops()          # just in case old build files are still present
pykeops.test_numpy_bindings()    # perform the compilation

should work fine.

We'll come back to you soon! Best regards, Jean
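
To confirm which CUDA release is active before and after a downgrade, the "release X.Y" field of `nvcc --version` can be read from Python. This is a minimal sketch under the assumption that `nvcc` is on the PATH and prints a line like the one in the original report; the helper names `parse_nvcc_release` and `installed_cuda_release` are hypothetical and not part of pykeops.

```python
import re
import subprocess


def parse_nvcc_release(version_text):
    """Extract (major, minor) from the 'release X.Y' field of `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release():
    """Run `nvcc --version`; return None if nvcc is missing or exits with an error."""
    try:
        result = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc not found or version not recognised")
    else:
        print("CUDA release %d.%d" % release)
```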

annavaughan commented 3 years ago

Thanks for looking into this @jeanfeydy! Downgrading CUDA works for me for now on colab.

nish03 commented 3 years ago

I was facing the exact same issue as @annavaughan since yesterday. CUDA downgrade works for me as well. Thanks

bcharlier commented 3 years ago

Hi,

It works for me on a fresh Arch Linux install with the packages python-pytorch-opt-cuda 1.7.1-6 and cuda 11.2.r11.2.

@scottgigante: I suggest updating the system.

b.

jeanfeydy commented 3 years ago

Hi @bcharlier!

On Colab, the default CUDA version is now 11.0. Unfortunately, I have now tried KeOps on Colab with CUDA 11.0, 11.1 and 11.2, and it fails in all three cases. gcc and g++ are both at version 7.5.

Out of the several combinations that I have tried, the most informative error message came from CUDA 11.2 with the KeOps master branch:

!apt-get install cuda=11.2.1-1 > log.txt
!git clone --recursive https://github.com/getkeops/keops.git keops
import os.path
import sys
sys.path.append('keops')

import pykeops
pykeops.clean_pykeops()
pykeops.test_torch_bindings() 
[pyKeOps] Initializing build folder for dtype=float32 and lang=torch in /root/.cache/pykeops-1.4.2-cpython-37 ... 
--------------------- CMAKE DEBUG -----------------
Command '['cmake', '/content/keops/pykeops/cmake_scripts/script_keops_formula/', "-DCMAKE_BUILD_TYPE='Release'", "-Dshared_obj_name='KeOps_formula'", "-DPYTHON_LANG='torch'", "-D__TYPE__='float'", '-DC_CONTIGUOUS=1', '-DPYTORCH_ROOT_DIR=/usr/local/lib/python3.7/dist-packages/torch', '-D_GLIBCXX_USE_CXX11_ABI=0', "-DcommandLine=cmake /content/keops/pykeops/cmake_scripts/script_keops_formula/ -DCMAKE_BUILD_TYPE='Release' -Dshared_obj_name='KeOps_formula' -DPYTHON_LANG='torch' -D__TYPE__='float' -DC_CONTIGUOUS=1 -DPYTORCH_ROOT_DIR=/usr/local/lib/python3.7/dist-packages/torch -D_GLIBCXX_USE_CXX11_ABI=0"]' returned non-zero exit status 1.
-- The CXX compiler identification is GNU 7.5.0
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - /usr/local/cuda/bin/nvcc
-- The CUDA Host CXX Compiler: /usr/bin/c++
-- The CUDA compiler identification is NVIDIA 11.2.142
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc -- works
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Compute properties automatically set to: -DMAXIDGPU=0;-DMAXTHREADSPERBLOCK0=1024;-DSHAREDMEMPERBLOCK0=49152
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Autodetected CUDA architecture(s):  7.5
-- Configuring incomplete, errors occurred!
See also "/root/.cache/pykeops-1.4.2-cpython-37/build-a8df550e6a/CMakeFiles/CMakeOutput.log".
See also "/root/.cache/pykeops-1.4.2-cpython-37/build-a8df550e6a/CMakeFiles/CMakeError.log".

--------------------- ----------- -----------------
done.
[pyKeOps] Compiling libKeOpstorch5962b0d17b in /root/.cache/pykeops-1.4.2-cpython-37:
       formula: Sum_Reduction(SqNorm2(x - y),1)
       aliases: x = Vi(0,3); y = Vj(1,3); 
       dtype  : float32
... 

--------------------- MAKE DEBUG -----------------
Command '['cmake', '--build', '.', '--target', 'KeOps_formula', '--', 'VERBOSE=1']' returned non-zero exit status 2.

--------------------- ----------- -----------------
[pyKeOps] Compiling pybind11 template libKeOps_template_09a892fa86 in /root/.cache/pykeops-1.4.2-cpython-37 ... [pyKeOps] Initializing build folder for dtype=float32 and lang=torch in /root/.cache/pykeops-1.4.2-cpython-37 ... 
--------------------- CMAKE DEBUG -----------------
Command '['cmake', '/content/keops/pykeops/cmake_scripts/script_keops_formula/', "-DCMAKE_BUILD_TYPE='Debug'", "-Dshared_obj_name='KeOps_formula'", "-DPYTHON_LANG='torch'", "-D__TYPE__='float'", '-DC_CONTIGUOUS=1', '-DPYTORCH_ROOT_DIR=/usr/local/lib/python3.7/dist-packages/torch', '-D_GLIBCXX_USE_CXX11_ABI=0', "-DcommandLine=cmake /content/keops/pykeops/cmake_scripts/script_keops_formula/ -DCMAKE_BUILD_TYPE='Debug' -Dshared_obj_name='KeOps_formula' -DPYTHON_LANG='torch' -D__TYPE__='float' -DC_CONTIGUOUS=1 -DPYTORCH_ROOT_DIR=/usr/local/lib/python3.7/dist-packages/torch -D_GLIBCXX_USE_CXX11_ABI=0"]' returned non-zero exit status 1.
-- The CXX compiler identification is GNU 7.5.0
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - /usr/local/cuda/bin/nvcc
-- The CUDA Host CXX Compiler: /usr/bin/c++
-- The CUDA compiler identification is NVIDIA 11.2.142
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc -- works
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Compute properties automatically set to: -DMAXIDGPU=0;-DMAXTHREADSPERBLOCK0=1024;-DSHAREDMEMPERBLOCK0=49152
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Autodetected CUDA architecture(s):  7.5
-- Configuring incomplete, errors occurred!
See also "/root/.cache/pykeops-1.4.2-cpython-37/build-b0d8d500b3/CMakeFiles/CMakeOutput.log".
See also "/root/.cache/pykeops-1.4.2-cpython-37/build-b0d8d500b3/CMakeFiles/CMakeError.log".

--------------------- ----------- -----------------
done.
[pyKeOps] Compiling libKeOpstorch5962b0d17b in /root/.cache/pykeops-1.4.2-cpython-37:
       formula: Sum_Reduction(SqNorm2(x - y),1)
       aliases: x = Vi(0,3); y = Vj(1,3); 
       dtype  : float32
... 

--------------------- MAKE DEBUG -----------------
Command '['cmake', '--build', '.', '--target', 'KeOps_formula', '--', 'VERBOSE=1']' returned non-zero exit status 2.

[standard error stack]

The CMakeError.log points to (yet another?) problem with the detection of pthreads:

Determining if the pthread_create exist failed with the following output:
Change Dir: /root/.cache/pykeops-1.4.2-cpython-37/build-b0d8d500b3/CMakeFiles/CMakeTmp

Run Build Command:"/usr/bin/make" "cmTC_5c860/fast"
/usr/bin/make -f CMakeFiles/cmTC_5c860.dir/build.make CMakeFiles/cmTC_5c860.dir/build
make[1]: Entering directory '/root/.cache/pykeops-1.4.2-cpython-37/build-b0d8d500b3/CMakeFiles/CMakeTmp'
Building CXX object CMakeFiles/cmTC_5c860.dir/CheckSymbolExists.cxx.o
/usr/bin/c++    -std=gnu++14 -o CMakeFiles/cmTC_5c860.dir/CheckSymbolExists.cxx.o -c /root/.cache/pykeops-1.4.2-cpython-37/build-b0d8d500b3/CMakeFiles/CMakeTmp/CheckSymbolExists.cxx
Linking CXX executable cmTC_5c860
/usr/local/lib/python2.7/dist-packages/cmake/data/bin/cmake -E cmake_link_script CMakeFiles/cmTC_5c860.dir/link.txt --verbose=1
/usr/bin/c++       CMakeFiles/cmTC_5c860.dir/CheckSymbolExists.cxx.o  -o cmTC_5c860 
CMakeFiles/cmTC_5c860.dir/CheckSymbolExists.cxx.o: In function `main':
CheckSymbolExists.cxx:(.text+0x1b): undefined reference to `pthread_create'
collect2: error: ld returned 1 exit status
CMakeFiles/cmTC_5c860.dir/build.make:86: recipe for target 'cmTC_5c860' failed
make[1]: *** [cmTC_5c860] Error 1
make[1]: Leaving directory '/root/.cache/pykeops-1.4.2-cpython-37/build-b0d8d500b3/CMakeFiles/CMakeTmp'
Makefile:121: recipe for target 'cmTC_5c860/fast' failed
make: *** [cmTC_5c860/fast] Error 2

File /root/.cache/pykeops-1.4.2-cpython-37/build-b0d8d500b3/CMakeFiles/CMakeTmp/CheckSymbolExists.cxx:
/* */
#include <pthread.h>

int main(int argc, char** argv)
{
  (void)argv;
#ifndef pthread_create
  return ((int*)(&pthread_create))[argc];
#else
  (void)argc;
  return 0;
#endif
}

Determining if the function pthread_create exists in the pthreads failed with the following output:
Change Dir: /root/.cache/pykeops-1.4.2-cpython-37/build-b0d8d500b3/CMakeFiles/CMakeTmp

Run Build Command:"/usr/bin/make" "cmTC_fd5e2/fast"
/usr/bin/make -f CMakeFiles/cmTC_fd5e2.dir/build.make CMakeFiles/cmTC_fd5e2.dir/build
make[1]: Entering directory '/root/.cache/pykeops-1.4.2-cpython-37/build-b0d8d500b3/CMakeFiles/CMakeTmp'
Building CXX object CMakeFiles/cmTC_fd5e2.dir/CheckFunctionExists.cxx.o
/usr/bin/c++    -DCHECK_FUNCTION_EXISTS=pthread_create   -std=gnu++14 -o CMakeFiles/cmTC_fd5e2.dir/CheckFunctionExists.cxx.o -c /root/.cache/pykeops-1.4.2-cpython-37/build-b0d8d500b3/CMakeFiles/CheckLibraryExists/CheckFunctionExists.cxx
Linking CXX executable cmTC_fd5e2
/usr/local/lib/python2.7/dist-packages/cmake/data/bin/cmake -E cmake_link_script CMakeFiles/cmTC_fd5e2.dir/link.txt --verbose=1
/usr/bin/c++   -DCHECK_FUNCTION_EXISTS=pthread_create    CMakeFiles/cmTC_fd5e2.dir/CheckFunctionExists.cxx.o  -o cmTC_fd5e2 -lpthreads 
/usr/bin/ld: cannot find -lpthreads
collect2: error: ld returned 1 exit status
CMakeFiles/cmTC_fd5e2.dir/build.make:86: recipe for target 'cmTC_fd5e2' failed
make[1]: *** [cmTC_fd5e2] Error 1
make[1]: Leaving directory '/root/.cache/pykeops-1.4.2-cpython-37/build-b0d8d500b3/CMakeFiles/CMakeTmp'
Makefile:121: recipe for target 'cmTC_fd5e2/fast' failed
make: *** [cmTC_fd5e2/fast] Error 2

Does it remind you of a previous issue?

Best regards, Jean
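
When reading configure output like the one above, it can help to tabulate the result of each "Looking for ..." probe. In the log quoted here, the probe in pthread ends with "found" and CMake reports "Found Threads: TRUE", so the entries in CMakeError.log may simply record the earlier failed attempts. The sketch below is illustrative only: `probe_results` and `sample_output` are made-up names, and the sample lines are copied from the log in this comment.

```python
import re

# CMake prints "-- Looking for X" when a probe starts and
# "-- Looking for X - found" / "- not found" when it finishes.
_PROBE = re.compile(r"^-- Looking for (?P<what>.+?) - (?P<result>found|not found)\s*$")


def probe_results(configure_output):
    """Map each finished 'Looking for ...' probe to True (found) or False (not found)."""
    results = {}
    for line in configure_output.splitlines():
        match = _PROBE.match(line)
        if match:
            results[match.group("what")] = match.group("result") == "found"
    return results


# Hypothetical variable holding lines copied from the configure output above.
sample_output = """\
-- Looking for a CUDA compiler - /usr/local/cuda/bin/nvcc
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
"""

if __name__ == "__main__":
    for what, found in probe_results(sample_output).items():
        print(("found     " if found else "not found ") + what)
```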

jeanfeydy commented 3 years ago

After some debugging work with @bcharlier, @joanglaunes and @gdurif, we have finally identified the problem:

/content/keops/pykeops/cmake_scripts/script_keops_formula/../../keops/core/formulas/maths/TensorDotNoTao.h(84): error: "auto" is not allowed here
/content/keops/pykeops/cmake_scripts/script_keops_formula/../../keops/core/formulas/maths/TensorDotNoTao.h(90): error: "auto" is not allowed here
2 errors detected in the compilation of "/content/keops/pykeops/keops/core/link_autodiff.cu".

Unfortunately, it seems that some of our advanced templating code to support the TensorDot (= "einsum") operation is not handled well by nvcc 11 with gcc/g++ < 10. We will either fix the TensorDot implementation quickly, or disable this (fairly niche) operation on configurations that do not provide the latest gcc/g++. In any case, the problem should be fixed soon!
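
For anyone hunting the same kind of failure in a long build log, the compiler diagnostics can be filtered out with a few lines of Python. This is a sketch under the assumption that nvcc errors follow the "path(line): error: message" layout seen above; `nvcc_errors` and `sample_build_log` are hypothetical names used only for illustration.

```python
import os.path
import re

# Layout assumed from the messages above: <path>(<line>): error: <message>
_NVCC_ERROR = re.compile(r"^(?P<path>.+)\((?P<line>\d+)\): error: (?P<message>.*)$")


def nvcc_errors(build_log):
    """Return (file name, line number, message) for each error line in the log."""
    errors = []
    for line in build_log.splitlines():
        match = _NVCC_ERROR.match(line.strip())
        if match:
            errors.append(
                (
                    os.path.basename(match.group("path")),
                    int(match.group("line")),
                    match.group("message"),
                )
            )
    return errors


# Hypothetical variable holding the three lines quoted in this comment.
sample_build_log = """\
/content/keops/pykeops/cmake_scripts/script_keops_formula/../../keops/core/formulas/maths/TensorDotNoTao.h(84): error: "auto" is not allowed here
/content/keops/pykeops/cmake_scripts/script_keops_formula/../../keops/core/formulas/maths/TensorDotNoTao.h(90): error: "auto" is not allowed here
2 errors detected in the compilation of "/content/keops/pykeops/keops/core/link_autodiff.cu".
"""

if __name__ == "__main__":
    for name, lineno, message in nvcc_errors(sample_build_log):
        print(f"{name}:{lineno}: {message}")
```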

bcharlier commented 3 years ago

@fradav should have some insight...

fradav commented 3 years ago

I'm having trouble reproducing the issue. This minimal code:

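#include <array>
#include <cstddef>      // std::size_t
#include <type_traits>  // std::decay_t, std::is_same
#include <utility>      // std::index_sequence, std::integer_sequence

// Expand the elements of the constexpr array Arr into an integer_sequence.
// Requires C++17 (`template <auto&>` non-type template parameters).
template <auto& Arr, std::size_t... Is>
static constexpr auto make_seq_impl(std::index_sequence<Is...>) {
    using T = typename std::decay_t<decltype(Arr)>::value_type;
    return std::integer_sequence<T, Arr[Is]...>{};
}

template <auto& Arr>
static constexpr auto make_seq() {
    return make_seq_impl<Arr>(std::make_index_sequence<Arr.size()>());
}

constexpr std::array a{7, 15, 28};

int main()
{
    // Compile-time check: the sequence holds exactly the array's elements.
    auto x = make_seq<a>();
    static_assert(std::is_same<std::integer_sequence<int, 7, 15, 28>, decltype(x)>::value, "!");
}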

works with every gcc version I tried, starting from 7.*

fradav commented 3 years ago

Is there no workaround on Colab for this, such as installing gcc 9 from https://anaconda.org/conda-forge/gcc_linux-64? The nvcc option would be -ccbin=gcc-9 (perhaps set through CUDA_NVCC_FLAGS in the environment).

fradav commented 3 years ago

What I suspect is that C++17 support for gcc backends <= 7 has been dropped by the latest CUDA. One clue is that the message error: "auto" is not allowed here is exactly what we get from gcc without -std=c++17.

joanglaunes commented 3 years ago

Hello @fradav, the error is also present when using gcc 9. On my university server, with Ubuntu 20.04, nvcc 11.0 and gcc 9.3, I get the same error. I gave @bcharlier the information needed to access the server; I can also send you an email if you want to try it out.

bcharlier commented 3 years ago

Commit e19e2a1 fixes the problem on rosenblatt. Does it help on Colab?

jeanfeydy commented 3 years ago

Hi @bcharlier,

Unfortunately, not completely... On a fresh Colab session:

!git clone --recursive https://github.com/getkeops/keops.git keops
import os.path
import sys
sys.path.append('keops')

import pykeops
pykeops.clean_pykeops()
pykeops.test_numpy_bindings() 

Returns:

Cleaning /root/.cache/pykeops-1.4.2-cpython-37/...
[pyKeOps] Initializing build folder for dtype=float64 and lang=numpy in /root/.cache/pykeops-1.4.2-cpython-37 ... 
--------------------- CMAKE DEBUG -----------------
Command '['cmake', '/content/keops/pykeops/cmake_scripts/script_keops_formula/', "-DCMAKE_BUILD_TYPE='Release'", "-Dshared_obj_name='KeOps_formula'", "-DPYTHON_LANG='numpy'", "-D__TYPE__='double'", '-DC_CONTIGUOUS=1', "-DcommandLine=cmake /content/keops/pykeops/cmake_scripts/script_keops_formula/ -DCMAKE_BUILD_TYPE='Release' -Dshared_obj_name='KeOps_formula' -DPYTHON_LANG='numpy' -D__TYPE__='double' -DC_CONTIGUOUS=1"]' returned non-zero exit status 1.
-- The CXX compiler identification is GNU 7.5.0
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - /usr/local/cuda/bin/nvcc
-- The CUDA Host CXX Compiler: /usr/bin/c++
-- The CUDA compiler identification is NVIDIA 11.0.221
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc -- works
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Compute properties automatically set to: -DMAXIDGPU=0;-DMAXTHREADSPERBLOCK0=1024;-DSHAREDMEMPERBLOCK0=49152
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Autodetected CUDA architecture(s):  6.1
-- Configuring incomplete, errors occurred!
See also "/root/.cache/pykeops-1.4.2-cpython-37/build-cc04f2f358/CMakeFiles/CMakeOutput.log".
See also "/root/.cache/pykeops-1.4.2-cpython-37/build-cc04f2f358/CMakeFiles/CMakeError.log".

--------------------- ----------- -----------------
done.
[pyKeOps] Compiling libKeOpsnumpy69a6ce2347 in /root/.cache/pykeops-1.4.2-cpython-37:
       formula: Sum_Reduction(SqNorm2(x - y),1)
       aliases: x = Vi(0,3); y = Vj(1,3); 
       dtype  : float64
... 

--------------------- MAKE DEBUG -----------------
Command '['cmake', '--build', '.', '--target', 'KeOps_formula', '--', 'VERBOSE=1']' returned non-zero exit status 2.

--------------------- ----------- -----------------
[pyKeOps] Compiling pybind11 template libKeOps_template_5da46c0346 in /root/.cache/pykeops-1.4.2-cpython-37 ... [pyKeOps] Initializing build folder for dtype=float64 and lang=numpy in /root/.cache/pykeops-1.4.2-cpython-37 ... 
--------------------- CMAKE DEBUG -----------------
Command '['cmake', '/content/keops/pykeops/cmake_scripts/script_keops_formula/', "-DCMAKE_BUILD_TYPE='Debug'", "-Dshared_obj_name='KeOps_formula'", "-DPYTHON_LANG='numpy'", "-D__TYPE__='double'", '-DC_CONTIGUOUS=1', "-DcommandLine=cmake /content/keops/pykeops/cmake_scripts/script_keops_formula/ -DCMAKE_BUILD_TYPE='Debug' -Dshared_obj_name='KeOps_formula' -DPYTHON_LANG='numpy' -D__TYPE__='double' -DC_CONTIGUOUS=1"]' returned non-zero exit status 1.
-- The CXX compiler identification is GNU 7.5.0
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - /usr/local/cuda/bin/nvcc
-- The CUDA Host CXX Compiler: /usr/bin/c++
-- The CUDA compiler identification is NVIDIA 11.0.221
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc -- works
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Compute properties automatically set to: -DMAXIDGPU=0;-DMAXTHREADSPERBLOCK0=1024;-DSHAREDMEMPERBLOCK0=49152
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Autodetected CUDA architecture(s):  6.1
-- Configuring incomplete, errors occurred!
See also "/root/.cache/pykeops-1.4.2-cpython-37/build-a4989c469c/CMakeFiles/CMakeOutput.log".
See also "/root/.cache/pykeops-1.4.2-cpython-37/build-a4989c469c/CMakeFiles/CMakeError.log".

--------------------- ----------- -----------------
done.
[pyKeOps] Compiling libKeOpsnumpy69a6ce2347 in /root/.cache/pykeops-1.4.2-cpython-37:
       formula: Sum_Reduction(SqNorm2(x - y),1)
       aliases: x = Vi(0,3); y = Vj(1,3); 
       dtype  : float64
... 

--------------------- MAKE DEBUG -----------------
Command '['cmake', '--build', '.', '--target', 'KeOps_formula', '--', 'VERBOSE=1']' returned non-zero exit status 2.

--------------------- ----------- -----------------
[pyKeOps] Compiling pybind11 template libKeOps_template_72e45f70af in /root/.cache/pykeops-1.4.2-cpython-37 ... 
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/content/keops/pykeops/test/install.py in test_numpy_bindings()
     21 
---> 22         my_conv = pknp.Genred(formula, var)
     23         if np.allclose(my_conv(x, y).flatten(), expected_res):

13 frames
/content/keops/pykeops/numpy/generic/generic_red.py in __init__(self, formula, aliases, reduction_op, axis, dtype, opt_arg, formula2, cuda_type, dtype_acc, use_double_acc, sum_scheme, enable_chunks, optional_flags, rec_multVar_highdim)
    182         self.myconv = LoadKeOps(
--> 183             self.formula, self.aliases, self.dtype, "numpy", self.optional_flags
    184         ).import_module()

/content/keops/pykeops/common/keops_io.py in __init__(self, formula, aliases, dtype, lang, optional_flags, include_dirs)
     47         ):
---> 48             self._safe_compile()
     49 

/content/keops/pykeops/common/utils.py in wrapper_filelock(*args, **kwargs)
     74                 with FileLock(f):
---> 75                     func_res = func(*args, **kwargs)
     76 

/content/keops/pykeops/common/keops_io.py in _safe_compile(self)
     62             self.include_dirs,
---> 63             self.build_folder,
     64         )

/content/keops/pykeops/common/compile_routines.py in compile_generic_routine(formula, aliases, dllname, dtype, lang, optional_flags, include_dirs, build_folder)
    245     template_name, is_rebuilt = get_or_build_pybind11_template(
--> 246         dtype, lang, include_dirs, use_prebuilt_formula=True
    247     )

/content/keops/pykeops/common/compile_routines.py in get_or_build_pybind11_template(dtype, lang, include_dirs, use_prebuilt_formula)
     87             pykeops.config.bin_folder + os.path.sep + fname,
---> 88             template_build_folder + os.path.sep + fname,
     89         )

FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/pykeops-1.4.2-cpython-37//KeOps_formula.o' -> '/root/.cache/pykeops-1.4.2-cpython-37//build-pybind11_template-libKeOps_template_5da46c0346/KeOps_formula.o'

During handling of the above exception, another exception occurred:

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-2-cbb191019c92> in <module>()
      1 import pykeops
      2 pykeops.clean_pykeops()
----> 3 pykeops.test_numpy_bindings()

/content/keops/pykeops/test/install.py in test_numpy_bindings()
     33         pykeops.config.verbose = True
     34         pykeops.config.build_type = "Debug"
---> 35         my_conv = pknp.Genred(formula, var)
     36         print(my_conv(x, y))
     37 

/content/keops/pykeops/numpy/generic/generic_red.py in __init__(self, formula, aliases, reduction_op, axis, dtype, opt_arg, formula2, cuda_type, dtype_acc, use_double_acc, sum_scheme, enable_chunks, optional_flags, rec_multVar_highdim)
    181         self.dtype = dtype
    182         self.myconv = LoadKeOps(
--> 183             self.formula, self.aliases, self.dtype, "numpy", self.optional_flags
    184         ).import_module()
    185         self.axis = axis

/content/keops/pykeops/common/keops_io.py in __init__(self, formula, aliases, dtype, lang, optional_flags, include_dirs)
     46             pykeops.config.build_type == "Debug"
     47         ):
---> 48             self._safe_compile()
     49 
     50     @create_and_lock_build_folder()

/content/keops/pykeops/common/utils.py in wrapper_filelock(*args, **kwargs)
     73             with open(os.path.join(bf, "pykeops_build2.lock"), "w") as f:
     74                 with FileLock(f):
---> 75                     func_res = func(*args, **kwargs)
     76 
     77             # clean

/content/keops/pykeops/common/keops_io.py in _safe_compile(self)
     61             self.optional_flags,
     62             self.include_dirs,
---> 63             self.build_folder,
     64         )
     65 

/content/keops/pykeops/common/compile_routines.py in compile_generic_routine(formula, aliases, dllname, dtype, lang, optional_flags, include_dirs, build_folder)
    244 
    245     template_name, is_rebuilt = get_or_build_pybind11_template(
--> 246         dtype, lang, include_dirs, use_prebuilt_formula=True
    247     )
    248 

/content/keops/pykeops/common/compile_routines.py in get_or_build_pybind11_template(dtype, lang, include_dirs, use_prebuilt_formula)
     86         os.rename(
     87             pykeops.config.bin_folder + os.path.sep + fname,
---> 88             template_build_folder + os.path.sep + fname,
     89         )
     90 

FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/pykeops-1.4.2-cpython-37//KeOps_formula.o' -> '/root/.cache/pykeops-1.4.2-cpython-37//build-pybind11_template-libKeOps_template_72e45f70af/KeOps_formula.o'

Inspecting the log files:

!cat /root/.cache/pykeops-1.4.2-cpython-37/build-a4989c469c/CMakeFiles/CMakeError.log
Determining if the pthread_create exist failed with the following output:
Change Dir: /root/.cache/pykeops-1.4.2-cpython-37/build-a4989c469c/CMakeFiles/CMakeTmp

Run Build Command:"/usr/bin/make" "cmTC_169d9/fast"
/usr/bin/make -f CMakeFiles/cmTC_169d9.dir/build.make CMakeFiles/cmTC_169d9.dir/build
make[1]: Entering directory '/root/.cache/pykeops-1.4.2-cpython-37/build-a4989c469c/CMakeFiles/CMakeTmp'
Building CXX object CMakeFiles/cmTC_169d9.dir/CheckSymbolExists.cxx.o
/usr/bin/c++    -std=gnu++14 -o CMakeFiles/cmTC_169d9.dir/CheckSymbolExists.cxx.o -c /root/.cache/pykeops-1.4.2-cpython-37/build-a4989c469c/CMakeFiles/CMakeTmp/CheckSymbolExists.cxx
Linking CXX executable cmTC_169d9
/usr/local/lib/python2.7/dist-packages/cmake/data/bin/cmake -E cmake_link_script CMakeFiles/cmTC_169d9.dir/link.txt --verbose=1
/usr/bin/c++       CMakeFiles/cmTC_169d9.dir/CheckSymbolExists.cxx.o  -o cmTC_169d9 
CMakeFiles/cmTC_169d9.dir/CheckSymbolExists.cxx.o: In function `main':
CheckSymbolExists.cxx:(.text+0x1b): undefined reference to `pthread_create'
collect2: error: ld returned 1 exit status
CMakeFiles/cmTC_169d9.dir/build.make:86: recipe for target 'cmTC_169d9' failed
make[1]: *** [cmTC_169d9] Error 1
make[1]: Leaving directory '/root/.cache/pykeops-1.4.2-cpython-37/build-a4989c469c/CMakeFiles/CMakeTmp'
Makefile:121: recipe for target 'cmTC_169d9/fast' failed
make: *** [cmTC_169d9/fast] Error 2

File /root/.cache/pykeops-1.4.2-cpython-37/build-a4989c469c/CMakeFiles/CMakeTmp/CheckSymbolExists.cxx:
/* */
#include <pthread.h>

int main(int argc, char** argv)
{
  (void)argv;
#ifndef pthread_create
  return ((int*)(&pthread_create))[argc];
#else
  (void)argc;
  return 0;
#endif
}

Determining if the function pthread_create exists in the pthreads failed with the following output:
Change Dir: /root/.cache/pykeops-1.4.2-cpython-37/build-a4989c469c/CMakeFiles/CMakeTmp

Run Build Command:"/usr/bin/make" "cmTC_115c9/fast"
/usr/bin/make -f CMakeFiles/cmTC_115c9.dir/build.make CMakeFiles/cmTC_115c9.dir/build
make[1]: Entering directory '/root/.cache/pykeops-1.4.2-cpython-37/build-a4989c469c/CMakeFiles/CMakeTmp'
Building CXX object CMakeFiles/cmTC_115c9.dir/CheckFunctionExists.cxx.o
/usr/bin/c++    -DCHECK_FUNCTION_EXISTS=pthread_create   -std=gnu++14 -o CMakeFiles/cmTC_115c9.dir/CheckFunctionExists.cxx.o -c /root/.cache/pykeops-1.4.2-cpython-37/build-a4989c469c/CMakeFiles/CheckLibraryExists/CheckFunctionExists.cxx
Linking CXX executable cmTC_115c9
/usr/local/lib/python2.7/dist-packages/cmake/data/bin/cmake -E cmake_link_script CMakeFiles/cmTC_115c9.dir/link.txt --verbose=1
/usr/bin/c++   -DCHECK_FUNCTION_EXISTS=pthread_create    CMakeFiles/cmTC_115c9.dir/CheckFunctionExists.cxx.o  -o cmTC_115c9 -lpthreads 
/usr/bin/ld: cannot find -lpthreads
collect2: error: ld returned 1 exit status
CMakeFiles/cmTC_115c9.dir/build.make:86: recipe for target 'cmTC_115c9' failed
make[1]: *** [cmTC_115c9] Error 1
make[1]: Leaving directory '/root/.cache/pykeops-1.4.2-cpython-37/build-a4989c469c/CMakeFiles/CMakeTmp'
Makefile:121: recipe for target 'cmTC_115c9/fast' failed
make: *** [cmTC_115c9/fast] Error 2

What do you think?

bcharlier commented 3 years ago

OK, I'm not sure that this error comes from KeOps... pthreads is not found on the Ubuntu image instantiated by Colab. A similar problem has been reported in another project, as described in this issue: https://github.com/Nuand/bladeRF/issues/554 ...

jeanfeydy commented 3 years ago

Hi @bcharlier ,

Are you sure about this?

According to the CMake output, it actually ends up finding pthread_create after two failures. As discussed here, I wonder if this pthreads issue is just a “false positive” that is leading us away from the real issue. Since KeOps works on Colab with CUDA 10.2, this should not be an insurmountable problem - and this is definitely a top priority bug. I have tried to look for a more explicit error message hidden somewhere, but I don’t have much experience with CMake and am probably missing something or looking at the wrong files. If you have time to try, launching a Colab session is really easy and I’m sure that you’ll have more success! What do you think?

Best regards, Jean

joanglaunes commented 3 years ago

Hello, I just found that upgrading CMake to version 3.18 solves the problem on Colab:

!pip install cmake==3.18

Another version between 3.12 and 3.18, or above 3.18, may also work; I did not check. So it seems to be a CMake bug... and I guess it will be enough to add cmake >= 3.18 as a requirement for the pip package?

joanglaunes commented 3 years ago

I'm just realizing that we are already supposed to require CMake 3.18 for KeOps; this comes from a previous issue we had in November (issue #122). Looking at that issue shows the same bunch of errors that we have been seeing these days (the "auto" problem, for example). So it seems that the CMake scripts no longer take this minimal CMake version requirement into account (it is at line 107 of file cuda.cmake).
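A minimal Python sketch of a guard that would surface this situation before any compilation starts: it compares the CMake found on PATH against the 3.18 minimum mentioned above. The helper names (`parse_cmake_version`, `version_ok`, `check_cmake`) are hypothetical and not part of pykeops; the only external assumption is that `cmake --version` prints a line of the form `cmake version X.Y.Z`.

```python
import re
import shutil
import subprocess

MIN_CMAKE = (3, 18)  # minimum version quoted in this thread


def parse_cmake_version(version_output):
    """Extract (major, minor, patch) from the text printed by `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("no version number found in: %r" % version_output)
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def version_ok(found, minimum=MIN_CMAKE):
    """Return True if `found` is at least `minimum` (compared component-wise)."""
    return tuple(found[:len(minimum)]) >= tuple(minimum)


def check_cmake(minimum=MIN_CMAKE):
    """Raise RuntimeError unless the cmake resolved on PATH is recent enough."""
    exe = shutil.which("cmake")
    if exe is None:
        raise RuntimeError("cmake was not found on PATH")
    output = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=True
    ).stdout
    found = parse_cmake_version(output)
    if not version_ok(found, minimum):
        raise RuntimeError(
            "cmake %s at %s is older than the required %s"
            % (".".join(map(str, found)), exe, ".".join(map(str, minimum)))
        )
    return found
```

Calling `check_cmake()` before the first build would turn the long CMake log above into a single explicit error message.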

bcharlier commented 3 years ago

I suggest putting cmake in the "extras_require" field and not in "install_requires", as the latter aggressively installs cmake via pip on systems that already have a working cmake 3.18. Done in c3cbc1e8
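In setuptools terms, the distinction is between `install_requires` (always installed) and `extras_require` (installed only when the extra is requested by name). The following is a hedged sketch of that layout, not the actual pykeops setup.py, which is not shown in this thread: the package name `example_pkg` and the `numpy` entry are placeholders, and only the `colab` extra name and the `cmake>=3.18` bound come from the discussion.

```python
# Hypothetical excerpt of a setup.py illustrating the two fields.

# Installed on every plain `pip install example_pkg`.
INSTALL_REQUIRES = ["numpy"]  # placeholder dependency

# Installed only on request, e.g. `pip install example_pkg[colab]`.
EXTRAS_REQUIRE = {
    "colab": ["cmake>=3.18"],
}


def setup_kwargs():
    """Collect the keyword arguments that would be passed to setuptools.setup()."""
    return {
        "name": "example_pkg",  # placeholder name
        "install_requires": INSTALL_REQUIRES,
        "extras_require": EXTRAS_REQUIRE,
    }


# In a real setup.py this would end with:
#     from setuptools import setup
#     setup(**setup_kwargs())
```

With this layout, systems that already ship a recent CMake are left alone, while environments with an older one can opt in through the extra.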

On Colab, this gives:

!pip install pykeops[colab]

jeanfeydy commented 3 years ago

Hi @bcharlier, perfect! Just let me know once you have pushed the new pykeops package to PyPI (v1.5?). I'll update the install instructions for Colab, re-render the doc and close the issue.

jeanfeydy commented 3 years ago

Version 1.5 is now online on PyPI (thanks @bcharlier!). As detailed in the doc, the simplest way of using KeOps on Colab is now to install it with:

!pip install pykeops[colab]

This ensures that an up-to-date version of CMake is available. I'm closing the issue now - but feel free to re-open it if needed, especially @scottgigante if the problem remains on your configuration. Best regards, Jean

kheyer commented 2 years ago

Just ran into this issue so I thought I'd add to it.

c++ version 9.3.0
nvcc version 11.1

The problem was fixed by upgrading cmake from 3.16 to 3.18.

HOWEVER I was running in circles because for some reason upgrading cmake didn't actually upgrade cmake at first. conda list showed cmake 3.18, but cmake --version showed 3.16 still. Checking which cmake returned the correct filepath (conda install for me).

I don't know if restarting the terminal session fixed it, but for a while I was in a state where conda showed cmake 3.18 installed, but cmake --version was showing 3.16. If you run into this and aren't able to fix it by upgrading cmake, make sure the cmake upgrade actually propagates.
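One way to check this is to list every `cmake` executable on PATH, in lookup order, together with the version each one reports. The sketch below is a small diagnostic under stated assumptions: the helper names `find_executables` and `report` are hypothetical, and it targets POSIX-style systems where the binary is simply called `cmake`. A stale result in an already open terminal can also come from the shell caching command locations (in bash, `hash -r` clears that cache), which would be consistent with a restart making a difference.

```python
import os
import subprocess


def find_executables(name, path=None):
    """Return every executable file called `name` on PATH, in lookup order."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if (
            os.path.isfile(candidate)
            and os.access(candidate, os.X_OK)
            and candidate not in hits
        ):
            hits.append(candidate)
    return hits


def report(name="cmake"):
    """Print each match with the first line of its `--version` output."""
    for exe in find_executables(name):
        result = subprocess.run([exe, "--version"], capture_output=True, text=True)
        lines = result.stdout.splitlines()
        print(exe, "->", lines[0] if lines else "(no output)")
```

Running `report()` shows whether an older system CMake still comes before the conda or pip one; the first entry printed is the one a fresh shell would pick up.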