Q: How can TERagwitz.m gain from GPU?

dwuab commented 9 years ago

The TRENTOOL documentation only mentions that the GPU can help with the ensemble method. However, from the source files, it seems that the GPU code only does nearest-neighbor searching. Is it possible to incorporate GPU code into functions such as TERagwitz?

mwibral commented 9 years ago

Hi,

The problem is how to fill the GPU with computations. In the ensemble method, the original data and a bunch of equally sized surrogate data sets go to the GPU, and they're all crunched simultaneously. This basically gives you the surrogate statistics at no additional time cost.

In the standard method, there is only one original data piece and one surrogate data set for each trial (instead of ~1000). Other channel pairs may have slightly different data sizes after embedding because of the autocorrelation decay time (ACT) and the embedding optimization, so they can't go to the card at the same time. We are looking into workarounds for that at the moment.

Best, Michael.
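
To illustrate the constraint described above: a batched call of this kind typically expects all data sets in one contiguous buffer with a fixed stride, which only works when every set has the same length. The following is a minimal CPU-side sketch in C++, not TRENTOOL code; `packEqualSizedChunks` is a hypothetical helper invented for this illustration, and the actual GPU memory layout used by TRENTOOL is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of TRENTOOL: packs several data sets into one
// contiguous buffer, the kind of layout a single batched GPU call would
// typically need. Returns false (leaving `batch` empty) when the data sets
// differ in length, because a fixed-stride buffer cannot hold ragged chunks.
bool packEqualSizedChunks(const std::vector<std::vector<float> >& chunks,
                          std::vector<float>& batch)
{
    batch.clear();
    if (chunks.empty()) {
        return false;
    }

    // Every chunk must match the length of the first one.
    const std::size_t chunkLength = chunks[0].size();
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        if (chunks[i].size() != chunkLength) {
            return false;  // e.g. channel pairs with different embeddings
        }
    }

    // Copy the chunks back to back: chunk i starts at offset i * chunkLength.
    batch.reserve(chunks.size() * chunkLength);
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        batch.insert(batch.end(), chunks[i].begin(), chunks[i].end());
    }
    assert(batch.size() == chunks.size() * chunkLength);
    return true;
}
```

In this picture the ensemble method passes the check (original data plus equally sized surrogates), while channel pairs with different embedded lengths do not, so they would need padding or separate calls.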

dwuab commented 9 years ago

@mwibral I'm interested in looking into this problem too. By the way, how can I view the source of cudaFindKnn?

Update: My time series are so long that even the Ragwitz "test" would take ages to complete. I'm trying to find a way to use the GPU to speed up the k-nearest-neighbor (kNN) search for the Ragwitz test.

dwuab commented 9 years ago

@mwibral I cannot find the source code for cudaFindKnn anywhere in the package. The .ptx file in libgpuKnnLibrary.a is low-level, CUDA-specific assembly code, and I cannot understand it.

mwibral commented 9 years ago

Dear Samuel,

I checked, and the CUDA code is indeed missing. I am traveling at the moment, but I will upload it next week.

Best, Michael

pwollstadt commented 9 years ago

Hi Samuel,

I uploaded the source code for the CUDA functions (see here), so you can have a look.

Best, Patricia

dwuab commented 9 years ago

@pwollstadt Thanks! I just changed the path to MATLAB and ran make on a machine that has NVIDIA cards and CUDA, and I got the following error messages:

[dwuab@login-0 cuda]$ make
/usr/local/cuda/bin/nvcc -m64  -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -Xcompiler -fPIC -c gpuKnnLibrary.cu
ar -r libgpuKnnLibrary.a gpuKnnLibrary.o
/usr/local/matlab-R2014a/bin/mex -L. -lgpuKnnLibrary -v fnearneigh_gpu.cpp -L/usr/local/cuda/lib64 -lcudart -lcusparse -lcublas
Verbose mode is on.
Neither -compatibleArrayDims nor -largeArrayDims is selected.
     Using -compatibleArrayDims. In the future, MATLAB will require the use of
     -largeArrayDims and remove the -compatibleArrayDims option.
     For more information:
     http://www.mathworks.com/help/matlab/matlab_external/upgrading-mex-files-to-use-64-bit-api.html.
No MEX options file identified; looking for an implicit selection.
... Looking for compiler 'g++' ...
... Executing command 'which g++' ...Yes ('/usr/bin/g++').
... Executing command 'g++ -print-file-name=libstdc++.so' ...Yes ('/usr/lib/gcc/x86_64-redhat-linux/4.4.7/libstdc++.so').
Found installed compiler 'g++'.
Options file details
-------------------------------------------------------------------
    Compiler location: $GCC_DIR
    Options file: /usr/local/matlab-R2014a/bin/glnxa64/mexopts/g++_glnxa64.xml
    CMDLINE1 : /usr/bin/g++ -c -DMX_COMPAT_32   -D_GNU_SOURCE -DMATLAB_MEX_FILE  -I"/usr/local/matlab-R2014a/extern/include" -I"/usr/local/matlab-R2014a/simulink/include" -ansi -fexceptions -fPIC -fno-omit-frame-pointer -pthread -O -DNDEBUG /d1/dwuab/TRENTOOL3/cuda/fnearneigh_gpu.cpp -o /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o
    CMDLINE2 : /usr/bin/g++ -pthread -Wl,--no-undefined  -shared -O -Wl,--version-script,"/usr/local/matlab-R2014a/extern/lib/glnxa64/mexFunction.map" /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o   -lgpuKnnLibrary  -lcudart  -lcusparse  -lcublas   -L.  -L/usr/local/cuda/lib64   -Wl,-rpath-link,/usr/local/matlab-R2014a/bin/glnxa64 -L"/usr/local/matlab-R2014a/bin/glnxa64" -lmx -lmex -lmat -lm -lstdc++ -o fnearneigh_gpu.mexa64
    CMDLINE3 : rm -f /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o
    CXX : /usr/bin/g++
    DEFINES : -DMX_COMPAT_32   -D_GNU_SOURCE -DMATLAB_MEX_FILE 
    MATLABMEX : -DMATLAB_MEX_FILE 
    CXXFLAGS : -ansi -fexceptions -fPIC -fno-omit-frame-pointer -pthread
    INCLUDE : -I"/usr/local/matlab-R2014a/extern/include" -I"/usr/local/matlab-R2014a/simulink/include"
    CXXOPTIMFLAGS : -O -DNDEBUG
    CXXDEBUGFLAGS : -g
    LDXX : /usr/bin/g++
    LDFLAGS : -pthread -Wl,--no-undefined 
    LDTYPE : -shared
    LINKEXPORT : -Wl,--version-script,"/usr/local/matlab-R2014a/extern/lib/glnxa64/mexFunction.map"
    LINKLIBS : -lgpuKnnLibrary  -lcudart  -lcusparse  -lcublas   -L.  -L/usr/local/cuda/lib64   -Wl,-rpath-link,/usr/local/matlab-R2014a/bin/glnxa64 -L"/usr/local/matlab-R2014a/bin/glnxa64" -lmx -lmex -lmat -lm -lstdc++
    LDOPTIMFLAGS : -O
    LDDEBUGFLAGS : -g
    OBJEXT : .o
    LDEXT : .mexa64
    GCC : /usr/bin/g++
    CPPLIB_DIR : /usr/lib/gcc/x86_64-redhat-linux/4.4.7/libstdc++.so
    MATLABROOT : /usr/local/matlab-R2014a
    ARCH : glnxa64
    SRC : /d1/dwuab/TRENTOOL3/cuda/fnearneigh_gpu.cpp
    OBJ : /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o
    OBJS : /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o 
    SRCROOT : /d1/dwuab/TRENTOOL3/cuda/fnearneigh_gpu
    DEF : /tmp/mex_46418310074994570_53220/fnearneigh_gpu.def
    EXP : fnearneigh_gpu.exp
    LIB : fnearneigh_gpu.lib
    EXE : fnearneigh_gpu.mexa64
    ILK : fnearneigh_gpu.ilk
    MANIFEST : fnearneigh_gpu.mexa64.manifest
    TEMPNAME : fnearneigh_gpu
    EXEDIR : 
    EXENAME : fnearneigh_gpu
    OPTIM : -O -DNDEBUG
    LINKOPTIM : -O
-------------------------------------------------------------------
Building with 'g++'.
/usr/bin/g++ -c -DMX_COMPAT_32   -D_GNU_SOURCE -DMATLAB_MEX_FILE  -I"/usr/local/matlab-R2014a/extern/include" -I"/usr/local/matlab-R2014a/simulink/include" -ansi -fexceptions -fPIC -fno-omit-frame-pointer -pthread -O -DNDEBUG /d1/dwuab/TRENTOOL3/cuda/fnearneigh_gpu.cpp -o /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o
/usr/bin/g++ -pthread -Wl,--no-undefined  -shared -O -Wl,--version-script,"/usr/local/matlab-R2014a/extern/lib/glnxa64/mexFunction.map" /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o   -lgpuKnnLibrary  -lcudart  -lcusparse  -lcublas   -L.  -L/usr/local/cuda/lib64   -Wl,-rpath-link,/usr/local/matlab-R2014a/bin/glnxa64 -L"/usr/local/matlab-R2014a/bin/glnxa64" -lmx -lmex -lmat -lm -lstdc++ -o fnearneigh_gpu.mexa64
/tmp/mex_46418310074994570_53220/fnearneigh_gpu.o: In function `mexFunction':
fnearneigh_gpu.cpp:(.text+0x268): undefined reference to `cudaFindKnn(int*, float*, float*, float*, int, int, int, int, int)'
collect2: ld returned 1 exit status

make: *** [mex] Error 255

It looks like something went wrong in the linking stage.

pwollstadt commented 9 years ago

@samuelandjw thanks for letting us know. I forwarded this error to Mario Martínez Zarzuela, who programmed the CUDA functions. I'll let you know as soon as possible.

dwuab commented 9 years ago

@pwollstadt I found the solution to the compilation problem. On my university's GPU cluster, mex actually invokes g++ to do the linking. Removing the extern "C" {...} surrounding cudaFindKnn and cudaFindRSAll works for me. I think we should either (1) specify g++ as the linker in the mex step and remove the extern "C" {...} around the two functions mentioned above, or (2) specify gcc as the linker.
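
For context on why this works: the linker error above names cudaFindKnn together with its parameter types, which is how a C++-linkage (mangled) symbol is reported, whereas a function defined inside extern "C" {...} is exported under its plain, unmangled name. The two sides therefore have to agree on the linkage. One possible alternative to removing extern "C" is to keep it and make the declaration seen by the caller match. The sketch below shows that pattern in a single C++ file; `nearestIndexStub` is a hypothetical stand-in with an invented signature, not the real cudaFindKnn interface.

```cpp
#include <cassert>
#include <cmath>

// Declaration as a shared header would carry it. The guard lets the same
// header be included from both C and C++ sources; in C++ it requests C
// linkage, so the caller references the unmangled symbol name.
#ifdef __cplusplus
extern "C" {
#endif
int nearestIndexStub(const float* points, int nPoints, float query);
#ifdef __cplusplus
}
#endif

// Definition with the same C linkage as the declaration above. If only one
// side used extern "C", the linker would report an undefined reference.
// Hypothetical stand-in: returns the index of the 1-D point closest to query.
extern "C" int nearestIndexStub(const float* points, int nPoints, float query)
{
    assert(points != 0 && nPoints > 0);
    int best = 0;
    float bestDist = std::fabs(points[0] - query);
    for (int i = 1; i < nPoints; ++i) {
        const float dist = std::fabs(points[i] - query);
        if (dist < bestDist) {
            bestDist = dist;
            best = i;
        }
    }
    return best;
}
```

Either direction resolves the mismatch: drop extern "C" on both sides, as done above in the thread, or keep it on both sides as in this sketch.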

trentool / TRENTOOL3

Q: How can TERagwitz.m gain from GPU? #13