Open dwuab opened 9 years ago
Hi,
the problem is how to fill the GPU with computations. In the ensemble method, the original data and a set of equally sized surrogate data sets go to the GPU, and they are all processed simultaneously. This basically gives you the surrogate statistics at no additional time cost.
In the standard method, there is only one original data set and one surrogate data set per trial (instead of ~1000). Other channel pairs may have slightly different data sizes after embedding because of the autocorrelation decay time (ACT) and the embedding optimization, so they can't go to the card at the same time. We are looking into workarounds for that at the moment.
Best, Michael.
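To illustrate the size constraint described above, here is a minimal sketch, not TRENTOOL code. It assumes that a batch is sent to the card as one contiguous buffer with a fixed stride per data set; the helper name packEqualSized is hypothetical. Under that assumption, equally sized sets pack into a single buffer, while sets of different lengths are rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of TRENTOOL. Packs several data sets into one
// contiguous buffer, as one might do before a single host-to-device copy.
// Assumption: the device code addresses set i at offset i * n, so every set
// must have the same length n. Returns false and leaves `flat` empty when the
// lengths differ.
bool packEqualSized(const std::vector<std::vector<float>>& sets,
                    std::vector<float>& flat) {
    flat.clear();
    if (sets.empty()) {
        return true;  // nothing to pack
    }
    const std::size_t n = sets.front().size();
    for (const auto& s : sets) {
        if (s.size() != n) {
            flat.clear();  // discard the partially packed buffer
            return false;
        }
        flat.insert(flat.end(), s.begin(), s.end());
    }
    return true;
}
```

With an original data set and its surrogates (all the same length) the call succeeds; with channel pairs whose embedded lengths differ it fails, which is the case the workarounds would have to handle.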
On 22.07.2015 03:48, samuelandjw wrote:
The documentation for Trentool only mentions that the GPU can help the ensemble method. However, from the source files, it seems that the GPU code only does nearest neighbor searching. Is it possible to incorporate GPU code into functions such as TERagwitz?
@mwibral I'm interested in looking into this problem too. By the way, how can I view the source of int cudaFindKnn?
Update: I have very long time series, such that even the Ragwitz "test" would take ages to complete. I'm trying to find a way to use the GPU to speed up the k-nearest-neighbor (kNN) search for the Ragwitz test.
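For reference, a brute-force nearest-neighbor search can be sketched on the CPU as below. This is only an illustrative baseline, not the algorithm used by cudaFindKnn or TERagwitz; the function name bruteForceKnn and the row-major data layout are assumptions. Each distance is computed independently of the others, which is the property that makes this kind of search a natural candidate for a GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical CPU reference: returns the indices of the k points nearest to
// row `query`, using squared Euclidean distance and excluding the query itself.
// `points` holds n rows of `dim` values each, stored row-major.
std::vector<std::size_t> bruteForceKnn(const std::vector<float>& points,
                                       std::size_t dim,
                                       std::size_t query,
                                       std::size_t k) {
    assert(dim > 0 && points.size() % dim == 0);
    const std::size_t n = points.size() / dim;
    assert(query < n && k < n);

    // One (distance, index) pair per candidate neighbor.
    std::vector<std::pair<float, std::size_t>> dist;
    dist.reserve(n - 1);
    for (std::size_t i = 0; i < n; ++i) {
        if (i == query) continue;
        float d = 0.0f;
        for (std::size_t j = 0; j < dim; ++j) {
            const float diff = points[i * dim + j] - points[query * dim + j];
            d += diff * diff;
        }
        dist.emplace_back(d, i);
    }

    // Only the k smallest distances need to be ordered.
    std::partial_sort(dist.begin(),
                      dist.begin() + static_cast<std::ptrdiff_t>(k),
                      dist.end());

    std::vector<std::size_t> idx(k);
    for (std::size_t j = 0; j < k; ++j) {
        idx[j] = dist[j].second;
    }
    return idx;
}
```

The cost grows with the square of the number of points when every point is used as a query, which is consistent with long time series becoming slow on the CPU.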
@mwibral I cannot find the source code for cudaFindKnn anywhere in the package. The .ptx file in libgpuKnnLibrary.a is low-level CUDA-specific assembly code and I cannot understand it.
Dear Samuel,
I checked, and the CUDA code is indeed missing. I am traveling at the moment, but will upload it next week.
Best, Michael
Hi Samuel,
I uploaded the source code for the CUDA functions (see here), so you can have a look.
Best, Patricia
@pwollstadt Thanks!
I just changed the path to Matlab and ran make on a machine that has NVIDIA cards and CUDA, and I got the following error messages:
[dwuab@login-0 cuda]$ make
/usr/local/cuda/bin/nvcc -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -Xcompiler -fPIC -c gpuKnnLibrary.cu
ar -r libgpuKnnLibrary.a gpuKnnLibrary.o
/usr/local/matlab-R2014a/bin/mex -L. -lgpuKnnLibrary -v fnearneigh_gpu.cpp -L/usr/local/cuda/lib64 -lcudart -lcusparse -lcublas
Verbose mode is on.
Neither -compatibleArrayDims nor -largeArrayDims is selected.
Using -compatibleArrayDims. In the future, MATLAB will require the use of
-largeArrayDims and remove the -compatibleArrayDims option.
For more information:
http://www.mathworks.com/help/matlab/matlab_external/upgrading-mex-files-to-use-64-bit-api.html.
No MEX options file identified; looking for an implicit selection.
... Looking for compiler 'g++' ...
... Executing command 'which g++' ...Yes ('/usr/bin/g++').
... Executing command 'g++ -print-file-name=libstdc++.so' ...Yes ('/usr/lib/gcc/x86_64-redhat-linux/4.4.7/libstdc++.so').
Found installed compiler 'g++'.
Options file details
-------------------------------------------------------------------
Compiler location: $GCC_DIR
Options file: /usr/local/matlab-R2014a/bin/glnxa64/mexopts/g++_glnxa64.xml
CMDLINE1 : /usr/bin/g++ -c -DMX_COMPAT_32 -D_GNU_SOURCE -DMATLAB_MEX_FILE -I"/usr/local/matlab-R2014a/extern/include" -I"/usr/local/matlab-R2014a/simulink/include" -ansi -fexceptions -fPIC -fno-omit-frame-pointer -pthread -O -DNDEBUG /d1/dwuab/TRENTOOL3/cuda/fnearneigh_gpu.cpp -o /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o
CMDLINE2 : /usr/bin/g++ -pthread -Wl,--no-undefined -shared -O -Wl,--version-script,"/usr/local/matlab-R2014a/extern/lib/glnxa64/mexFunction.map" /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o -lgpuKnnLibrary -lcudart -lcusparse -lcublas -L. -L/usr/local/cuda/lib64 -Wl,-rpath-link,/usr/local/matlab-R2014a/bin/glnxa64 -L"/usr/local/matlab-R2014a/bin/glnxa64" -lmx -lmex -lmat -lm -lstdc++ -o fnearneigh_gpu.mexa64
CMDLINE3 : rm -f /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o
CXX : /usr/bin/g++
DEFINES : -DMX_COMPAT_32 -D_GNU_SOURCE -DMATLAB_MEX_FILE
MATLABMEX : -DMATLAB_MEX_FILE
CXXFLAGS : -ansi -fexceptions -fPIC -fno-omit-frame-pointer -pthread
INCLUDE : -I"/usr/local/matlab-R2014a/extern/include" -I"/usr/local/matlab-R2014a/simulink/include"
CXXOPTIMFLAGS : -O -DNDEBUG
CXXDEBUGFLAGS : -g
LDXX : /usr/bin/g++
LDFLAGS : -pthread -Wl,--no-undefined
LDTYPE : -shared
LINKEXPORT : -Wl,--version-script,"/usr/local/matlab-R2014a/extern/lib/glnxa64/mexFunction.map"
LINKLIBS : -lgpuKnnLibrary -lcudart -lcusparse -lcublas -L. -L/usr/local/cuda/lib64 -Wl,-rpath-link,/usr/local/matlab-R2014a/bin/glnxa64 -L"/usr/local/matlab-R2014a/bin/glnxa64" -lmx -lmex -lmat -lm -lstdc++
LDOPTIMFLAGS : -O
LDDEBUGFLAGS : -g
OBJEXT : .o
LDEXT : .mexa64
GCC : /usr/bin/g++
CPPLIB_DIR : /usr/lib/gcc/x86_64-redhat-linux/4.4.7/libstdc++.so
MATLABROOT : /usr/local/matlab-R2014a
ARCH : glnxa64
SRC : /d1/dwuab/TRENTOOL3/cuda/fnearneigh_gpu.cpp
OBJ : /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o
OBJS : /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o
SRCROOT : /d1/dwuab/TRENTOOL3/cuda/fnearneigh_gpu
DEF : /tmp/mex_46418310074994570_53220/fnearneigh_gpu.def
EXP : fnearneigh_gpu.exp
LIB : fnearneigh_gpu.lib
EXE : fnearneigh_gpu.mexa64
ILK : fnearneigh_gpu.ilk
MANIFEST : fnearneigh_gpu.mexa64.manifest
TEMPNAME : fnearneigh_gpu
EXEDIR :
EXENAME : fnearneigh_gpu
OPTIM : -O -DNDEBUG
LINKOPTIM : -O
-------------------------------------------------------------------
Building with 'g++'.
/usr/bin/g++ -c -DMX_COMPAT_32 -D_GNU_SOURCE -DMATLAB_MEX_FILE -I"/usr/local/matlab-R2014a/extern/include" -I"/usr/local/matlab-R2014a/simulink/include" -ansi -fexceptions -fPIC -fno-omit-frame-pointer -pthread -O -DNDEBUG /d1/dwuab/TRENTOOL3/cuda/fnearneigh_gpu.cpp -o /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o
/usr/bin/g++ -pthread -Wl,--no-undefined -shared -O -Wl,--version-script,"/usr/local/matlab-R2014a/extern/lib/glnxa64/mexFunction.map" /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o -lgpuKnnLibrary -lcudart -lcusparse -lcublas -L. -L/usr/local/cuda/lib64 -Wl,-rpath-link,/usr/local/matlab-R2014a/bin/glnxa64 -L"/usr/local/matlab-R2014a/bin/glnxa64" -lmx -lmex -lmat -lm -lstdc++ -o fnearneigh_gpu.mexa64
/tmp/mex_46418310074994570_53220/fnearneigh_gpu.o: In function `mexFunction':
fnearneigh_gpu.cpp:(.text+0x268): undefined reference to `cudaFindKnn(int*, float*, float*, float*, int, int, int, int, int)'
collect2: ld returned 1 exit status
make: *** [mex] Error 255
It looks like something goes wrong in the linking stage.
@samuelandjw thanks for letting us know. I forwarded this error to Mario Martínez Zarzuela, who programmed the CUDA functions. I'll let you know as soon as possible.
@pwollstadt I found the solution to the compilation problem. On my university's GPU cluster, mex actually invokes g++ to do the linking. Removing the extern "C" {...} surrounding cudaFindKnn and cudaFindRSAll works for me.
I think we should either:
1) specify g++ as the linker in the mex step and remove the extern "C" {...} around the two functions mentioned above, or
2) specify gcc as the linker.
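As background for the fix: the linker error above names the missing symbol with its full parameter list, cudaFindKnn(int*, float*, ...), which suggests that the calling file referenced a C++-mangled name while the library exported the function under extern "C" (unmangled). The declaration seen by the caller and the definition have to agree on linkage. A minimal sketch of a consistent pair follows; the function nearestIndexDemo is hypothetical and stands in for the real CUDA entry points.

```cpp
#include <cassert>

// Declaration as the calling translation unit would see it, for example via a
// shared header. nearestIndexDemo is a hypothetical stand-in, not TRENTOOL API.
extern "C" int nearestIndexDemo(const float* data, int n, float query);

// Definition with the SAME linkage specification. If only one side carried
// extern "C", the caller would look for a differently named symbol and the
// link step would fail with an undefined reference.
extern "C" int nearestIndexDemo(const float* data, int n, float query) {
    assert(data != nullptr && n > 0);
    int best = 0;
    float bestDist = (data[0] - query) * (data[0] - query);
    for (int i = 1; i < n; ++i) {
        const float d = (data[i] - query) * (data[i] - query);
        if (d < bestDist) {
            bestDist = d;
            best = i;
        }
    }
    return best;  // index of the element closest to `query`
}
```

Either both sides use extern "C" or neither does; removing it from the definitions, as described above, is one way to make them agree when the caller is compiled as C++ without the specifier.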