Qiskit / qiskit-aer

Aer is a high performance simulator for quantum circuits that includes noise models
https://qiskit.github.io/qiskit-aer/
Apache License 2.0

Aer fail on GPU build - libpthread(s) #723

Closed jwoehr closed 4 years ago

jwoehr commented 4 years ago

Information

What is the current behavior?

python setup.py install -- -DAER_THRUST_BACKEND=CUDA -- -j8

fails with this message:

ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/jax/work/QISKit/qiskit_dev_venv4/lib/python3.6/site-packages/skbuild/setuptools_wrap.py", line 577, in setup
    cmkr.make(make_args, env=env)
  File "/home/jax/work/QISKit/qiskit_dev_venv4/lib/python3.6/site-packages/skbuild/cmaker.py", line 482, in make
    os.path.abspath(CMAKE_BUILD_DIR())))

An error occurred while building with CMake.
  Command:
    "/home/jax/work/QISKit/qiskit_dev_venv4/lib/python3.6/site-packages/cmake/data/bin/cmake" "--build" "." "--target" "install" "--config" "Release" "--" "-j8"
  Source directory:
    /home/jax/work/QISKit/DEV/qiskit-aer
  Working directory:
    /home/jax/work/QISKit/DEV/qiskit-aer/_skbuild/linux-x86_64-3.6/cmake-build
Please see CMake's output for more information.
Makefile:22: recipe for target 'aer' failed
make: *** [aer] Error 1

qiskit-aer/_skbuild/linux-x86_64-3.6/cmake-build/CMakeFiles/CMakeError.log shows the following:

Performing C SOURCE FILE Test CMAKE_HAVE_LIBC_PTHREAD failed with the following output:
Change Dir: /home/jax/work/QISKit/DEV/qiskit-aer/_skbuild/linux-x86_64-3.6/cmake-build/CMakeFiles/CMakeTmp

Run Build Command(s):/usr/bin/ninja cmTC_525f8 && [1/2] Building C object CMakeFiles/cmTC_525f8.dir/src.c.o
[2/2] Linking C executable cmTC_525f8
FAILED: cmTC_525f8 
: && /usr/bin/cc -DCMAKE_HAVE_LIBC_PTHREAD   CMakeFiles/cmTC_525f8.dir/src.c.o  -o cmTC_525f8   && :
CMakeFiles/cmTC_525f8.dir/src.c.o: In function `main':
src.c:(.text+0x3e): undefined reference to `pthread_create'
src.c:(.text+0x4a): undefined reference to `pthread_detach'
src.c:(.text+0x5b): undefined reference to `pthread_join'
src.c:(.text+0x6f): undefined reference to `pthread_atfork'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.

Source file was:
#include <pthread.h>

void* test_func(void* data)
{
  return data;
}

int main(void)
{
  pthread_t thread;
  pthread_create(&thread, NULL, test_func, NULL);
  pthread_detach(thread);
  pthread_join(thread, NULL);
  pthread_atfork(NULL, NULL, NULL);
  pthread_exit(NULL);

  return 0;
}

This same configuration test program, which fails during the build, compiles and links when I build it by hand with the command line:

gcc -Wall -o test_pthread test_pthread.c -lpthread

With libpthreads.so created as a soft link to libpthread.so, the test program also compiles with the command line:

gcc -Wall -o test_pthread test_pthread.c -lpthreads

which is the lib name that the Aer build may be expecting.
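A note on reading this log: to the best of my understanding, CMake's FindThreads module tries several ways of linking pthreads in turn (libc alone, then a named thread library, then the -pthread flag), and CMakeError.log records every attempt that fails, including expected ones. A failed CMAKE_HAVE_LIBC_PTHREAD entry may therefore be a probe result rather than the cause of the build failure. The sketch below repeats two of those attempts by hand. The function name try_pthread_link and the probe file are made up for illustration, and cc is assumed to be on the PATH.

```shell
# Hypothetical helper: try to link a tiny pthread program with the given
# extra compiler/linker flags. Returns the compiler's exit status.
try_pthread_link() {
  dir="$(mktemp -d)" || return 1
  cat > "$dir/probe.c" <<'EOF'
#include <pthread.h>
static void *test_func(void *data) { return data; }
int main(void)
{
  pthread_t thread;
  pthread_create(&thread, NULL, test_func, NULL);
  pthread_join(thread, NULL);
  return 0;
}
EOF
  cc -o "$dir/probe" "$dir/probe.c" "$@" 2>/dev/null
  status=$?
  rm -rf "$dir"
  return "$status"
}

# The first attempt mirrors the CMAKE_HAVE_LIBC_PTHREAD probe (no thread library).
# Whether it succeeds depends on the C library version.
if try_pthread_link; then
  echo "libc alone provides the pthread symbols"
else
  echo "libc alone is not enough (same result as the log above)"
fi

# The second attempt adds -pthread.
if try_pthread_link -pthread; then
  echo "linking with -pthread works"
fi
```

If the first attempt fails and the second succeeds, the toolchain is behaving normally and the real error is likely elsewhere in the build output.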

Steps to reproduce the problem

Build Aer with the command line python setup.py install -- -DAER_THRUST_BACKEND=CUDA -- -j8

What is the expected behavior?

Aer builds

Suggested solutions

Modify cmake configuration files:

atilag commented 4 years ago

CMake is a complex beast, and understanding its error messages requires some previous experience with the tool; otherwise it is pretty easy to lose track of what's going on. Everything should work if you follow the instructions in the contributing guide; no extra steps are needed. One thing the guide does not say explicitly is that on Linux and Mac the preferred build tool is Make, not ninja. That doesn't mean things won't build with ninja, just that we haven't tested it, so I'd recommend removing ninja if you are not using it for any other purpose on your system. Try these steps and show me the build.log:

$ cd qiskit-aer
$ rm -rf _skbuild
$ VERBOSE=1 python ./setup.py bdist_wheel -- -DAER_THRUST_BACKEND=CUDA -- -j 2>&1|tee build.log

jwoehr commented 4 years ago

Hi @atilag ... the tee output is attached, along with CMakeError.log. As noted, it's bombing while looking for libpthreads. Even if that library exists (I created it as a soft link to libpthread, which works for linking the configure test manually), it doesn't satisfy the link because, I think, Ninja is not writing the link line correctly.

Ninja is there because so many Qiskit requirements won't build without it :)

If this is too much trouble, I suppose I can remove it.

build.log CMakeError.log

jwoehr commented 4 years ago

BTW, Aer builds fine if I don't try to build the GPU/Cuda version.

atilag commented 4 years ago

Ok, there seem to be two different problems there, but I think they could be related somehow. Let's fix the first one:

ninja: fatal: invalid -j parameter

I told you to use -j because this is the parameter one passes to Make for parallel compilation, but ninja doesn't understand this. If you don't want to remove ninja, then we have to tell the build system to use Make so:

$ VERBOSE=1 python ./setup.py bdist_wheel -- -G "Unix Makefiles" -DAER_THRUST_BACKEND=CUDA -- -j 2>&1|tee build.log
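
An alternative that sidesteps the difference between the two tools is to pass an explicit job count, which both make and ninja accept. A minimal sketch, assuming nproc (GNU coreutils) is available. The variable names njobs and build_args are only illustrative, and the command is printed rather than run:

```shell
# Pick an explicit job count so the same argument works with make or ninja.
# Fall back to 2 jobs if nproc is unavailable (an arbitrary choice).
njobs="$(nproc 2>/dev/null || echo 2)"
build_args="-j${njobs}"

# Print the resulting command instead of running it.
echo "python ./setup.py bdist_wheel -- -DAER_THRUST_BACKEND=CUDA -- ${build_args}"
```

With an explicit number there is no need to force a particular generator just to get a parallel build.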

jwoehr commented 4 years ago

Same or similar problem:

$ less /home/jax/work/QISKit/DEV/qiskit-aer/_skbuild/linux-x86_64-3.6/cmake-build/CMakeFiles/CMakeError.log
Performing C SOURCE FILE Test CMAKE_HAVE_LIBC_PTHREAD failed with the following output:
Change Dir: /home/jax/work/QISKit/DEV/qiskit-aer/_skbuild/linux-x86_64-3.6/cmake-build/CMakeFiles/CMakeTmp

Run Build Command(s):/usr/bin/make cmTC_6a46f/fast && make[1]: Entering directory '/home/jax/work/QISKit/DEV/qiskit-aer/_skbuild/linux-x86_64-3.6/cmake-build/CMakeFiles/CMakeTmp'
/usr/bin/make -f CMakeFiles/cmTC_6a46f.dir/build.make CMakeFiles/cmTC_6a46f.dir/build
make[2]: Entering directory '/home/jax/work/QISKit/DEV/qiskit-aer/_skbuild/linux-x86_64-3.6/cmake-build/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_6a46f.dir/src.c.o
/usr/bin/cc   -DCMAKE_HAVE_LIBC_PTHREAD   -o CMakeFiles/cmTC_6a46f.dir/src.c.o   -c /home/jax/work/QISKit/DEV/qiskit-aer/_skbuild/linux-x86_64-3.6/cmake-build/CMakeFiles/CMakeTmp/src.c
Linking C executable cmTC_6a46f
/home/jax/work/QISKit/qiskit_dev_venv4/lib/python3.6/site-packages/cmake/data/bin/cmake -E cmake_link_script CMakeFiles/cmTC_6a46f.dir/link.txt --verbose=1
/usr/bin/cc  -DCMAKE_HAVE_LIBC_PTHREAD     CMakeFiles/cmTC_6a46f.dir/src.c.o  -o cmTC_6a46f 
CMakeFiles/cmTC_6a46f.dir/src.c.o: In function `main':
src.c:(.text+0x3e): undefined reference to `pthread_create'
src.c:(.text+0x4a): undefined reference to `pthread_detach'
src.c:(.text+0x5b): undefined reference to `pthread_join'
src.c:(.text+0x6f): undefined reference to `pthread_atfork'
collect2: error: ld returned 1 exit status
CMakeFiles/cmTC_6a46f.dir/build.make:86: recipe for target 'cmTC_6a46f' failed
make[2]: *** [cmTC_6a46f] Error 1
make[2]: Leaving directory '/home/jax/work/QISKit/DEV/qiskit-aer/_skbuild/linux-x86_64-3.6/cmake-build/CMakeFiles/CMakeTmp'
Makefile:121: recipe for target 'cmTC_6a46f/fast' failed
make[1]: *** [cmTC_6a46f/fast] Error 2
make[1]: Leaving directory '/home/jax/work/QISKit/DEV/qiskit-aer/_skbuild/linux-x86_64-3.6/cmake-build/CMakeFiles/CMakeTmp'

Source file was:
#include <pthread.h>

void* test_func(void* data)
{
  return data;
}

int main(void)
{
  pthread_t thread;
  pthread_create(&thread, NULL, test_func, NULL);
  pthread_detach(thread);
  pthread_join(thread, NULL);
  pthread_atfork(NULL, NULL, NULL);
  pthread_exit(NULL);

  return 0;
}

atilag commented 4 years ago

What about the build.log?

jwoehr commented 4 years ago

build.log

vvilpas commented 4 years ago

Hi @jwoehr Looking in the last build.log you provided, I see this error:

nvcc fatal   : Unsupported gpu architecture 'compute_20' 

At the beginning there are these lines related to CUDA conf:

-- Found CUDA: /usr (found version "9.1") 
-- Thrust library: CUDA found!
-- Autodetected CUDA architecture(s):  2.1(2.0)

Does your CUDA device belong to this architecture group (2.0)?
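
For anyone hitting a similar failure, the quickest way to find a line like the nvcc one is to search build.log itself instead of CMakeError.log. A minimal sketch: the function name find_build_errors and the search pattern are assumptions chosen for illustration and will not catch every kind of failure.

```shell
# Hypothetical helper: print numbered lines of a build log that look like
# real failures. The pattern is a guess, not an exhaustive list.
find_build_errors() {
  # -n prints line numbers; -E enables the alternation pattern.
  grep -n -E 'fatal|error:|FAILED' "$1"
}
```

Calling it as find_build_errors build.log lists candidate lines with their line numbers, so the surrounding context can be inspected in the log.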

jwoehr commented 4 years ago

@vvilpas :

$ nvidia-smi
Thu Apr 30 05:55:48 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.116                Driver Version: 390.116                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 540M     Off  | 00000000:01:00.0 N/A |                  N/A |
| N/A   49C    P0    N/A /  N/A |    361MiB /   963MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+

vvilpas commented 4 years ago

The GeForce GT 540M's CUDA compute capability is 2.1 (you can check here: https://developer.nvidia.com/cuda-gpus). Unfortunately, support for that architecture was dropped in CUDA 9. I'm not sure whether downgrading to CUDA 8 (in case you want to) would work. Maybe @hhorii can shed some light on this.
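
The architecture check can be sketched as a small lookup. The table below is an assumption based on NVIDIA's toolkit release notes as I recall them (CUDA 8 still targets 2.x, CUDA 9 and 10 start at 3.0) and covers only those versions. The function names are made up, and the check says nothing about other requirements such as the host compiler.

```shell
# Assumed minimum compute capability per CUDA toolkit major version.
# Only versions 8, 9 and 10 are listed; anything else is reported as unknown.
min_compute_capability() {
  case "$1" in
    8)    echo "2.0" ;;
    9|10) echo "3.0" ;;
    *)    return 1 ;;
  esac
}

# Usage: gpu_supported CUDA_MAJOR CAPABILITY
# Returns 0 if supported, 1 if not, 2 if the CUDA version is not in the table.
gpu_supported() {
  min="$(min_compute_capability "$1")" || return 2
  # Compare as integers: 2.1 becomes 21, 3.0 becomes 30 (single-digit minor only).
  [ "$(echo "$2" | tr -d .)" -ge "$(echo "$min" | tr -d .)" ]
}

if gpu_supported 9 2.1; then
  echo "CUDA 9 can target compute capability 2.1"
else
  echo "CUDA 9 cannot target compute capability 2.1"
fi
```

For the card in this report (capability 2.1) with the detected CUDA 9.1, the check fails, which matches the nvcc error above.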

atilag commented 4 years ago

@jwoehr @vvilpas I don't think CUDA 8 will work, because we require a modern version of the compiler (due to C++11 stuff). I'm closing this issue; @jwoehr, feel free to reopen.