Check that $CUDA_HOME/include/cuda_runtime_api.h exists and that you have permission to access it (e.g. head $CUDA_HOME/include/cuda_runtime_api.h). If it does, check config.log, search for "cuda_runtime_api.h", and see if there's some unexpected nvcc error.
If you aren't able to figure out what's going wrong from that, please attach config.log here so the devs can take a look.
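For example, a quick sketch of those checks, run from the directory where configure was invoked (so that config.log is present):
# Verify the header exists and is readable
head "$CUDA_HOME/include/cuda_runtime_api.h"
# See what configure logged about the header and about nvcc
grep -n "cuda_runtime_api.h" config.log
grep -n -i "nvcc" config.log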
$CUDA_HOME/include/cuda_runtime_api.h does exist. I have set CUDA_HOME in both /etc/profile and ~/.bashrc as follows:
CUDA_HOME=/usr/local/cuda
export CUDA_HOME
PATH=$PATH:$CUDA_HOME/bin
export PATH
I can use nvcc in my terminal, and I always use sudo to run configure.
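(Note: sudo typically resets the environment, so a CUDA_HOME exported in ~/.bashrc may not be visible inside a sudo ./configure run; one quick way to check is to compare:
env | grep CUDA_HOME
sudo env | grep CUDA_HOME
)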
I am sorry; because I am in China, GitHub sometimes cannot be accessed, and I failed to upload the file. Would you mind taking a look here? Thank you! https://gist.github.com/poofee/54e53923df63aaafab6ffb621174e9be
May I ask where the file "conftest.c" is, and what it is? I have now run out of every method I know. I logged in to the root account and ran this in the terminal:
CUDA_HOME=/usr/local/cuda
export CUDA_HOME
PATH=$PATH:$CUDA_HOME/bin
export PATH
But it still tells me it cannot find the NVCC include path. What's wrong? It's so weird.
Finally, I have solved the problem. Since the configure process fails, I changed the configure file: I added the line spral_nvcc_inc_ok=yes so that configure would succeed. But when I typed make, the error still happened, so I tried to edit the makefile. I found that the file src/hw_topology/guess_topology.cxx uses a CUDA include.
So I ran the command echo | gcc -v -x c++ -E -, and the CUDA include path wasn't listed. So I guessed that was the key. In the command line, I typed:
CPLUS_INCLUDE_PATH=/usr/local/cuda/include
export CPLUS_INCLUDE_PATH
And then, when I redid the make, no error was found. Oh my god! So my question is: is this a bug, or should everyone add the CUDA include path to the default C++ include path?
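(For reference, a sketch of that check, assuming CUDA is installed under /usr/local/cuda:
# Show the C++ preprocessor's default include search path
echo | gcc -v -x c++ -E - 2>&1 | sed -n '/search starts here/,/End of search list/p'
# Add the CUDA headers to that search path for the current shell session
export CPLUS_INCLUDE_PATH=/usr/local/cuda/include
)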
Please recheck the file spral/m4/spral_nvcc_lib.m4, line 23:
save_CPPFLAGS="$CCPFLAGS"; CPPFLAGS="$CPPFLAGS $NVCC_INCLUDE_FLAGS"
Although I don't know much about configure, I think there is a mistake here.
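(For clarity, the suspected fix is to save the real CPPFLAGS rather than the misspelled CCPFLAGS:
save_CPPFLAGS="$CPPFLAGS"; CPPFLAGS="$CPPFLAGS $NVCC_INCLUDE_FLAGS"
)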
Hi poofee,
Please, can you try configuring as described below (sudo shouldn't be needed) in a clean state of the local repository:
./autogen.sh
./configure --with-metis="-L/usr/local/lib -lmetis" --with-blas="-L/usr/OpenBLAS -lopenblas"
That is, without specifying NVCC_INCLUDE_FLAGS at all.
First, please try to unset CPLUS_INCLUDE_PATH and run configure as above. If you still get errors, can you please set CPLUS_INCLUDE_PATH as you have described (to /usr/local/cuda/include) and then re-run configure as above. Please let us know which option (with or without CPLUS_INCLUDE_PATH set) works for you, if any. If there are errors in both cases, the corresponding config.log files would be much appreciated.
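(A sketch of the two requested runs, from the repository root:
# Run 1: without CPLUS_INCLUDE_PATH
unset CPLUS_INCLUDE_PATH
./autogen.sh
./configure --with-metis="-L/usr/local/lib -lmetis" --with-blas="-L/usr/OpenBLAS -lopenblas"
# Run 2: with CPLUS_INCLUDE_PATH pointing at the CUDA headers
export CPLUS_INCLUDE_PATH=/usr/local/cuda/include
./configure --with-metis="-L/usr/local/lib -lmetis" --with-blas="-L/usr/OpenBLAS -lopenblas"
)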
When I cloned the code, I built it as you said, and everything went OK. But when I tried to use CUDA, the problems occurred.
I mean, when I use this:
./autogen.sh
./configure --with-metis="-L/usr/local/lib -lmetis" --with-blas="-L/usr/OpenBLAS -lopenblas"
I can build the code successfully. However, configure still cannot find nvcc, so this version doesn't use CUDA, even though I can use the nvcc command in my terminal.
To be exact, CPLUS_INCLUDE_PATH doesn't work whether it is set or not; C_INCLUDE_PATH works.
Thank you for the check!
Can you try the following:
Edit Makefile.am and uncomment line 8 (remove the # sign before NVCCFLAGS).
Depending on the GPU you have in your machine, set the appropriate -arch=sm_... flag in that line (roughly, sm_20 for Fermis; sm_30, sm_35, or sm_37 for general Keplers, K40s, and K80s, respectively; and so on).
Re-run configure, this time explicitly setting NVCC:
./autogen.sh
NVCC=nvcc ./configure --with-metis="-L/usr/local/lib -lmetis" --with-blas="-L/usr/OpenBLAS -lopenblas"
Do you get CUDA recognized and working now?
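(For illustration only, assuming a Kepler-class card such as a K40, the uncommented line in Makefile.am might then read:
NVCCFLAGS = -arch=sm_35
The exact contents of that line are an assumption here; check your copy of the file.)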
No. It still tells me "NVCC include path not found" unless I set C_INCLUDE_PATH first. By the way, as I have mentioned, in the file spral/m4/spral_nvcc_lib.m4, line 23 reads:
save_CPPFLAGS="$CCPFLAGS"; CPPFLAGS="$CPPFLAGS $NVCC_INCLUDE_FLAGS"
Isn't CCPFLAGS a spelling mistake?
Yes, it looks like a typo, thank you! Maintainers, could you please fix this?
If you change CCPFLAGS to CPPFLAGS and reconfigure, do you get CUDA working? If so, can you unset C_INCLUDE_PATH and still get everything configured and built?
No. I guess there is something wrong with the configure file. Thank you for your time. Since I don't know much about the configure file, and I have successfully built the library using C_INCLUDE_PATH, I can skip this problem. Thank you! I will keep watching what's going on.
OK, I'll post here if I figure out what is happening.
My environment has C_INCLUDE_PATH set by the modules system, so I'll try to unset it and configure without it.
I can now confirm that, even with the typo fixed, configure fails when C_INCLUDE_PATH is not set to a directory containing the CUDA includes:
configure: error: NVCC include path not found
Thank you again for reporting this!
It's good to have the problem confirmed. I really learned a lot.
Hi guys, I am trying to package spral into the Gentoo ebuild system, but I am having this CUDA-related issue.
It is failing in the configure part, with the same message as discussed here. My Gentoo ebuild is quite simple:
src_prepare() {
    default
    WANT_AUTOCONF=2.5 eautoreconf
    WANT_AUTOMAKE=1.9 eautomake
}

src_configure() {
    local myeconfargs=(
        BLAS_LIBS=$(pkg-config --libs-only-l blas)
        LAPACK_LIBS=$(pkg-config --libs-only-l lapack)
        C_INCLUDE_FLAGS="/opt/cuda/include"
        NVCC_INCLUDE_FLAGS="/opt/cuda/include"
    )
    econf "${myeconfargs[@]}"
}
It fails in the src_configure part. Any suggestion as to what I am doing wrong here? Thanks a lot.
My generated configure command looks like:
./configure --prefix=/usr --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc --localstatedir=/var/lib --disable-dependency-tracking --disable-silent-rules --docdir=/usr/share/doc/spral-9999 --htmldir=/usr/share/doc/spral-9999/html --libdir=/usr/lib64 BLAS_LIBS=-lf77blas LAPACK_LIBS=-lreflapack -lf77blas C_INCLUDE_FLAGS=/opt/cuda/include NVCC_INCLUDE_FLAGS=/opt/cuda/include
So it is already too late; I understand that this should be set before the ./configure execution.
Where exactly should I set this variable? In Makefile.in?
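(One possible approach, sketched on the assumption that the variable needs to be in the environment before configure runs rather than passed as a configure argument: export it in src_configure before calling econf. C_INCLUDE_PATH is the variable reported to work earlier in this thread:
src_configure() {
    # Make the CUDA headers visible to the compiler during configure's checks
    export C_INCLUDE_PATH=/opt/cuda/include
    export CPLUS_INCLUDE_PATH=/opt/cuda/include
    local myeconfargs=(
        BLAS_LIBS=$(pkg-config --libs-only-l blas)
        LAPACK_LIBS=$(pkg-config --libs-only-l lapack)
    )
    econf "${myeconfargs[@]}"
}
)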
@archenroot:
Are you using the latest sources from master? If so, can you try not setting C_INCLUDE_FLAGS and NVCC_INCLUDE_FLAGS in your ebuild (i.e., not putting them in myeconfargs)? What do you get?
Thanks for the quick response. So if I go just with this:
src_prepare() {
    default
    WANT_AUTOCONF=2.5 eautoreconf
    WANT_AUTOMAKE=1.9 eautomake
}

src_configure() {
    local myeconfargs=(
        BLAS_LIBS=$(pkg-config --libs-only-l blas)
        LAPACK_LIBS=$(pkg-config --libs-only-l lapack)
    )
    econf "${myeconfargs[@]}"
}

src_install() {
    emake
}
This generates the following ./configure command:
./configure --prefix=/usr --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc --localstatedir=/var/lib --disable-dependency-tracking --disable-silent-rules --docdir=/usr/share/doc/spral-9999 --htmldir=/usr/share/doc/spral-9999/html --libdir=/usr/lib64 BLAS_LIBS=-lf77blas LAPACK_LIBS=-lreflapack -lf77blas
I still have the issue with NVCC. nvcc itself is found in the first lines, but at the end the include-detection mechanism doesn't work well:
checking for nvcc... nvcc
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
cuInit: 999
checking build system type... x86_64-pc-linux-gnu
checking host system type... x86_64-pc-linux-gnu
checking how to get verbose linking output from x86_64-pc-linux-gnu-gfortran... -v
checking for Fortran libraries of x86_64-pc-linux-gnu-gfortran... -L/usr/lib/gcc/x86_64-pc-linux-gnu/5.4.0 -L/usr/lib/gcc/x86_64-pc-linux-gnu/5.4.0/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-pc-linux-gnu/5.4.0/../../../../x86_64-pc-linux-gnu/lib -L/usr/lib/gcc/x86_64-pc-linux-gnu/5.4.0/../../.. -lgfortran -lm -lquadmath
checking flags to link C main with x86_64-pc-linux-gnu-gfortran... none
checking for std::align... no
checking for sched_getcpu()... yes
checking how to get verbose linking output from x86_64-pc-linux-gnu-gfortran... -v
checking for Fortran 77 libraries of x86_64-pc-linux-gnu-gfortran... -L/usr/lib/gcc/x86_64-pc-linux-gnu/5.4.0 -L/usr/lib/gcc/x86_64-pc-linux-gnu/5.4.0/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-pc-linux-gnu/5.4.0/../../../../x86_64-pc-linux-gnu/lib -L/usr/lib/gcc/x86_64-pc-linux-gnu/5.4.0/../../.. -lgfortran -lm -lquadmath
checking for dummy main to link with Fortran 77 libraries... none
checking for Fortran 77 name-mangling scheme... lower case, underscore, no extra underscore
checking for sgemm_ in -lf77blas ... yes
checking for cheev_ in -lreflapack -lf77blas ... yes
checking for METIS library... checking for metis_nodend_ in -L -lmetis... no
checking for metis_nodend_ in -lmetis... yes
checking version of METIS... "version 4"
checking for x86_64-pc-linux-gnu-pkg-config... /usr/bin/x86_64-pc-linux-gnu-pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for HWLOC... no
configure: WARNING: hwloc not supplied: cannot detect NUMA regions
checking how to run the C preprocessor... x86_64-pc-linux-gnu-gcc -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking cuda_runtime_api.h usability... no
checking cuda_runtime_api.h presence... no
checking for cuda_runtime_api.h... no
checking for cuda_runtime_api.h... (cached) no
configure: error: NVCC include path not found
You have a very interesting CUDA environment, it seems. If you look at line 3 of the configure output, it says:
cuInit: 999
That output in turn must have been produced by the nvcc_arch_sm.c program. If you look there, and at
http://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__INITIALIZE.html#group__CUDA__INITIALIZE
you will see that such a return value from cuInit (999, the generic CUDA_ERROR_UNKNOWN code) is at least not documented for that function, if not impossible.
So, something strange is going on here. Can you compile and run the CUDA samples on the machine where you're testing this procedure?
:-) Yes, good catch. I am on the following versions:
* x11-drivers/nvidia-drivers
Latest version available: 381.22
Latest version installed: 381.22
* dev-util/nvidia-cuda-toolkit
Latest version available: 8.0.61
Latest version installed: 8.0.61
* dev-util/nvidia-cuda-sdk
Latest version available: 8.0.61
Latest version installed: 8.0.61
But when I compile and run the following piece of code:
#include <stdio.h>
#include <dlfcn.h>

/* Build with: gcc cuinit_test.c -o cuinit_test -ldl */
int main() {
    /* Load the CUDA driver library at runtime and look up cuInit */
    void *cudalib = dlopen("libcuda.so", RTLD_NOW);
    int (*cuInit_fn)(unsigned int) = (int (*)(unsigned int)) dlsym(cudalib, "cuInit");
    /* Print cuInit's return code: 0 means CUDA_SUCCESS */
    int retval = (*cuInit_fn)(0);
    printf("%d\n", retval);
    return 0;
}
I get 0; it looks to me like CUDA works fine.
But I will try to examine this. The truth is that in Gentoo you can switch which GPU is used for OpenGL; it had been switched to run on the Intel graphics. Now I have switched it to the Nvidia card, and I got an additional message:
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
* ACCESS DENIED: open_wr: /dev/nvidia-uvm
* ACCESS DENIED: open_wr: /dev/nvidia-uvm
Normally OpenGL can be switched to the integrated Intel GPU and I can still use CUDA separately...
I will re-install/compile those 3 packages again and give it another try...
OK, I shouldn't have done the OpenGL target switch; it somehow broke my Nvidia device, hence the ACCESS DENIED on /dev/nvidia-uvm.
Anyway, I reinstalled the packages, and certainly GCC itself (I have a multi-GCC experimental machine); in this case I work with 5.4.0-r3.
Now I executed the following, more advanced hello world for CUDA:
#include <stdio.h>
#include <stdlib.h>   /* needed for exit() and EXIT_SUCCESS */

#define cudaCheckErrors(msg) \
    do { \
        cudaError_t __err = cudaGetLastError(); \
        if (__err != cudaSuccess) { \
            fprintf(stderr, "Fatal error: %s (%s at %s:%d)\n", \
                msg, cudaGetErrorString(__err), \
                __FILE__, __LINE__); \
            fprintf(stderr, "*** FAILED - ABORTING\n"); \
            exit(1); \
        } \
    } while (0)

const int N = 16;
const int blocksize = 16;

/* Shift each character of a by the offset in b: "Hello " becomes "World!" */
__global__
void hello(char *a, int *b)
{
    a[threadIdx.x] += b[threadIdx.x];
}

int main()
{
    char a[N] = "Hello \0\0\0\0\0\0";
    int b[N] = {15, 10, 6, 0, -11, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
    char *ad;
    int *bd;
    const int csize = N*sizeof(char);
    const int isize = N*sizeof(int);

    printf("%s", a);

    cudaMalloc( (void**)&ad, csize );
    cudaMalloc( (void**)&bd, isize );
    cudaCheckErrors("cudaMalloc fail");
    cudaMemcpy( ad, a, csize, cudaMemcpyHostToDevice );
    cudaMemcpy( bd, b, isize, cudaMemcpyHostToDevice );
    cudaCheckErrors("cudaMemcpy H2D fail");

    dim3 dimBlock( blocksize, 1 );
    dim3 dimGrid( 1, 1 );
    hello<<<dimGrid, dimBlock>>>(ad, bd);
    cudaCheckErrors("Kernel fail");

    cudaMemcpy( a, ad, csize, cudaMemcpyDeviceToHost );
    cudaCheckErrors("cudaMemcpy D2H/Kernel fail");
    cudaFree( ad );
    cudaFree( bd );

    printf("%s\n", a);
    return EXIT_SUCCESS;
}
I get the following:
zangetsu@ares ~ $ ./hello_world_cuda
Hello World!
zangetsu@ares ~ $ cuda-memcheck ./hello_world_cuda
========= CUDA-MEMCHECK
Hello World!
========= ERROR SUMMARY: 0 errors
I additionally tried to query the device with the toolkit's query utility:
zangetsu@ares ~ $ /opt/cuda/sdk/bin/x86_64/linux/release/deviceQueryDrv
/opt/cuda/sdk/bin/x86_64/linux/release/deviceQueryDrv Starting...
CUDA Device Query (Driver API) statically linked version
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 960M"
CUDA Driver Version: 8.0
CUDA Capability Major/Minor version number: 5.0
Total amount of global memory: 2003 MBytes (2100232192 bytes)
( 5) Multiprocessors, (128) CUDA Cores/MP: 640 CUDA Cores
GPU Max Clock rate: 1098 MHz (1.10 GHz)
Memory Clock rate: 2505 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 2097152 bytes
Max Texture Dimension Sizes 1D=(65536) 2D=(65536, 65536) 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Texture alignment: 512 bytes
Maximum memory pitch: 2147483647 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
Result = PASS
So it seems like CUDA works for me. Still, when I try to recompile the package now, I get the same error. Also note that I have multiple packages installed with CUDA support enabled, and they work...
I don't know what you are trying to achieve with this, but your code cannot work as you've said:
char a[N] = "Hello \0\0\0\0\0\0";
int b[N] = {15, 10, 6, 0, -11, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
In the kernel, you are adding the elements of b to a, in the same order. So you would add 15 to 'H'. How do you expect to have 'H' in the output?
I don't have to compile the code to see that.
This is just a dummy CUDA test, and the output is:
zangetsu@ares ~ $ ./hello_world_cuda
Hello World!
zangetsu@ares ~ $ cuda-memcheck ./hello_world_cuda
========= CUDA-MEMCHECK
Hello World!
========= ERROR SUMMARY: 0 errors
The code works, you should compile it :dagger: (The first printf prints "Hello " before the kernel runs; the kernel then shifts the buffer contents to "World!".)
Any idea? CUDA works on the machine... thanks for any hint...
Closing as outdated, please open a new issue if you still have problems.
On my Ubuntu 16.04 system, when I run configure, it outputs "error: NVCC include path not found". I am sure I have set the correct CUDA_HOME. I don't know much about configure and makefiles, so how can I fix it? Thank you!
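(For reference, the workaround reported earlier in this thread was to export the CUDA include directory before running configure, e.g.:
export C_INCLUDE_PATH=$CUDA_HOME/include
export CPLUS_INCLUDE_PATH=$CUDA_HOME/include
)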