maxwellmb closed this issue 11 months ago
What cuda version do you have?
On Tue, Nov 21, 2023 at 5:22 PM Max Millar-Blanchaer < @.***> wrote:
Hi Lianqi,
When following the instructions to compile the code, I get a series of errors that seem to be related to the gpu.
The first is:
/usr/bin/ld: /usr/bin/ld: /usr/bin/ld: /home/maxmb/Library/maos_compiled/cuda/sim/.libs/libcusim.a(cudata.o): in function `gpu_dbg':
/home/maxmb/Library/maos/cuda/sim/cudata.cu:53: undefined reference to `cudaGetDeviceProperties_v2'
and then there are 10 or slightly more with a similar pattern.
This is based on the most recent commit in the main branch.
Any advice would be much appreciated! I'm not sure if it's my computer or something in the cudata.cu file.
Best wishes, Max
According to nvidia-smi I have CUDA version 12.2.
This is the same version I have. Is the build in an empty folder? If not, can you try compiling from an empty folder? If that is already the case, can you run make V=1 and copy me the relevant lines around the error message?
Here's what I got with make V=1 in an empty folder.
I went back maybe a bit further than I needed to, just in case:
Making all in test
depbase=`echo gpu_mvm_daemon.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
gcc -DHAVE_CONFIG_H -I. -I/home/maxmb/Library/maos/cuda/test -I../.. -I/home/maxmb/.aos/include -I /home/maxmb/Library/maos -std=c11 -fPIC -DDLONG -D_XOPEN_SOURCE=500 -D_POSIX_C_SOURCE=200809L -Wall -Wshadow -Wextra -Wno-missing-braces -Wno-missing-field-initializers -g -O3 -mtune=native -ftree-vectorize -ffast-math -fno-finite-math-only -fopenmp-simd -fopenmp -MT gpu_mvm_daemon.o -MD -MP -MF $depbase.Tpo -c -o gpu_mvm_daemon.o /home/maxmb/Library/maos/cuda/test/gpu_mvm_daemon.c &&\
mv -f $depbase.Tpo $depbase.Po
/home/maxmb/anaconda3/bin/nvcc -ccbin gcc -I/home/maxmb/anaconda3/include -O3 -g -DHAVE_CONFIG_H -I/home/maxmb/Library/maos_compiled -lineinfo -Wno-deprecated-gpu-targets -arch=compute_50 --compiler-options " -Wall -Wshadow -Wextra -Wno-missing-braces -Wno-missing-field-initializers -g -O3 -mtune=native -ftree-vectorize -ffast-math -fno-finite-math-only -I/home/maxmb/.aos/include -fopenmp -fPIC -DDLONG -Wno-unused-parameter -Wno-unused-value -D__STRICT_ANSI__ " -c -o mvm_daemon.o /home/maxmb/Library/maos/cuda/test/mvm_daemon.cu
/bin/bash ../../libtool --tag=CC --mode=link gcc -I /home/maxmb/Library/maos -std=c11 -fPIC -DDLONG -D_XOPEN_SOURCE=500 -D_POSIX_C_SOURCE=200809L -Wall -Wshadow -Wextra -Wno-missing-braces -Wno-missing-field-initializers -g -O3 -mtune=native -ftree-vectorize -ffast-math -fno-finite-math-only -fopenmp-simd -fopenmp -Wl,--no-as-needed,--disable-new-dtags -no-fast-install -avoid-version -L/home/maxmb/anaconda3/lib -Wl,-rpath,/home/maxmb/anaconda3/lib -L/home/maxmb/.aos/lib64 -Wl,-rpath,/home/maxmb/.aos/lib64 -o gpu_mvm_daemon gpu_mvm_daemon.o mvm_daemon.o ../recon/libcurecon.la ../../lib/libaos.la -lfftw3 -lfftw3f -pthread -lfftw3_threads -lfftw3f_threads -lcholmod -l:liblapack.so.3 -l:libblas.so.3 -lrt -lz -lm -ldl
libtool: link: gcc -I /home/maxmb/Library/maos -std=c11 -fPIC -DDLONG -D_XOPEN_SOURCE=500 -D_POSIX_C_SOURCE=200809L -Wall -Wshadow -Wextra -Wno-missing-braces -Wno-missing-field-initializers -g -O3 -mtune=native -ftree-vectorize -ffast-math -fno-finite-math-only -fopenmp-simd -fopenmp -Wl,--no-as-needed -Wl,--disable-new-dtags -Wl,-rpath -Wl,/home/maxmb/anaconda3/lib -Wl,-rpath -Wl,/home/maxmb/.aos/lib64 -o gpu_mvm_daemon gpu_mvm_daemon.o mvm_daemon.o -L/home/maxmb/anaconda3/lib -L/home/maxmb/.aos/lib64 ../recon/.libs/libcurecon.a /home/maxmb/Library/maos_compiled/cuda/sim/.libs/libcusim.a /home/maxmb/Library/maos_compiled/cuda/math/.libs/libcumath.a -lcurand -lcusolver -lcusparse -lcufft -lcublas -lcudart -lstdc++ ../../lib/.libs/libaos.a /home/maxmb/Library/maos_compiled/math/.libs/libaomath.a /home/maxmb/Library/maos_compiled/sys/.libs/libaosys.a -lfftw3 -lfftw3f -lfftw3_threads -lfftw3f_threads -lcholmod -l:liblapack.so.3 -l:libblas.so.3 -lrt -lz -lm -ldl -pthread -fopenmp
/usr/bin/ld: mvm_daemon.o: in function `gpu_mvm_gpu_init(void*)':
/home/maxmb/Library/maos/cuda/test/mvm_daemon.cu:319: undefined reference to `cudaGetDeviceProperties_v2'
/usr/bin/ld: /home/maxmb/Library/maos_compiled/cuda/sim/.libs/libcusim.a(cudata.o): in function `gpu_dbg':
/home/maxmb/Library/maos/cuda/sim/cudata.cu:53: undefined reference to `cudaGetDeviceProperties_v2'
/usr/bin/ld: /home/maxmb/Library/maos_compiled/cuda/sim/.libs/libcusim.a(cudata.o): in function `gpu_init':
/home/maxmb/Library/maos/cuda/sim/cudata.cu:255: undefined reference to `cudaGetDeviceProperties_v2'
/usr/bin/ld: /home/maxmb/Library/maos/cuda/sim/cudata.cu:104: undefined reference to `cudaGetDeviceProperties_v2'
collect2: error: ld returned 1 exit status
make[3]: *** [Makefile:479: gpu_mvm_daemon] Error 1
make[2]: *** [Makefile:416: all-recursive] Error 1
make[1]: *** [Makefile:527: all-recursive] Error 1
make: *** [Makefile:458: all] Error 2
Can you cd to /home/maxmb/anaconda3/lib and issue nm -gD libcudart.so | grep cudaGetDevice? Do you get the following output (ignore the addresses)?
000000000004b800 T cudaGetDevice
000000000004a500 T cudaGetDeviceCount
000000000004bcf0 T cudaGetDeviceFlags
0000000000070b70 T cudaGetDeviceProperties
000000000004a6a0 T cudaGetDeviceProperties_v2
If you do get these, I am not sure what the error would be. In that case,
can you try to install cuda using the system package manager (dnf in
rhel/centos) or nvidia installer and use --with-cuda=/path/to/cuda/lib
(e.g., /usr/local/cuda/lib) during configuration.
Thanks,
Lianqi
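The nm check above can also be scripted, which may help when several CUDA installations are present. The following is only a minimal sketch, not part of MAOS: the function names exported_symbols and library_exports are hypothetical, and the sample text is the expected output quoted above.

```python
import subprocess


def exported_symbols(nm_output):
    """Return the names of defined text symbols ('T') found in `nm -gD` output."""
    names = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols are printed as "<address> <type> <name>".
        if len(parts) == 3 and parts[1] == "T":
            # Drop a symbol-version suffix such as "@@libcudart.so.10.2".
            names.add(parts[2].split("@")[0])
    return names


def library_exports(path, symbol):
    """Run nm on a shared library and report whether it defines `symbol`."""
    result = subprocess.run(
        ["nm", "-gD", path], capture_output=True, text=True, check=True
    )
    return symbol in exported_symbols(result.stdout)


# Expected output quoted in the thread (addresses are irrelevant).
SAMPLE = """\
000000000004b800 T cudaGetDevice
000000000004a500 T cudaGetDeviceCount
000000000004bcf0 T cudaGetDeviceFlags
0000000000070b70 T cudaGetDeviceProperties
000000000004a6a0 T cudaGetDeviceProperties_v2
"""

if __name__ == "__main__":
    print("cudaGetDeviceProperties_v2" in exported_symbols(SAMPLE))
```

To check a real installation, something like library_exports("/home/maxmb/anaconda3/lib/libcudart.so", "cudaGetDeviceProperties_v2") could be called, assuming nm is on the PATH.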
Alright, so I get these ones:
000000000004c180 T cudaGetDevice@@libcudart.so.10.2
000000000004cdf0 T cudaGetDeviceCount@@libcudart.so.10.2
000000000004bc90 T cudaGetDeviceFlags@@libcudart.so.10.2
000000000004cc30 T cudaGetDeviceProperties@@libcudart.so.10.2
but nothing about cudaGetDeviceProperties_v2.
Perhaps I should try with the system package manager?
You seem to have a version mismatch between the headers (12.2) and the library (10.2). Try reinstalling CUDA, either in conda or with the system package manager.
Lianqi
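A header/library mismatch like the one diagnosed above can be detected before building by comparing the CUDART_VERSION macro in cuda_runtime_api.h with the version tag on the library's symbols. The sketch below is an illustration under assumptions: the helper names are hypothetical, the header line is a made-up example for a 12.2 toolkit, and the 1000*major + 10*minor encoding is the usual CUDA convention. Note also that nvidia-smi reports the highest CUDA version the driver supports, not the version of the installed toolkit, so it cannot confirm which libcudart the linker will pick up.

```python
import re


def header_version(header_text):
    """Extract (major, minor) from a '#define CUDART_VERSION <n>' line, or None."""
    match = re.search(
        r"^\s*#\s*define\s+CUDART_VERSION\s+(\d+)", header_text, re.MULTILINE
    )
    if match is None:
        return None
    value = int(match.group(1))
    # Assumed encoding: 1000 * major + 10 * minor (e.g. 12020 for 12.2).
    return (value // 1000, (value % 1000) // 10)


def library_version(nm_output):
    """Extract (major, minor) from a symbol-version tag like '@@libcudart.so.10.2'."""
    match = re.search(r"@@?libcudart\.so\.(\d+)\.(\d+)", nm_output)
    if match is None:
        # Some builds of nm or libcudart print no version tag at all.
        return None
    return (int(match.group(1)), int(match.group(2)))


def versions_match(header_text, nm_output):
    """True only when both versions are known and equal."""
    header = header_version(header_text)
    library = library_version(nm_output)
    return header is not None and header == library


# Illustrative inputs: a hypothetical header line and one nm line from the thread.
HEADER = "#define CUDART_VERSION  12020\n"
NM_LINE = "000000000004cc30 T cudaGetDeviceProperties@@libcudart.so.10.2\n"

if __name__ == "__main__":
    print(header_version(HEADER), library_version(NM_LINE))
    print(versions_match(HEADER, NM_LINE))
```

When the library prints no version tag, library_version returns None and versions_match reports False, so a negative result should be read as "not confirmed" rather than as proof of a mismatch.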
Thanks for the troubleshooting! After fighting with cuda installs for a while I was finally able to get it re-installed and now my MAOS+gpu setup works! Cheers, Max