Xtra-Computing / thundersvm

ThunderSVM: A Fast SVM Library on GPUs and CPUs
Apache License 2.0

jupyter notebook kernel dies unexpectedly #154

Closed by Gunnvant 5 years ago

Gunnvant commented 5 years ago

I am trying to use svc.fit() from the Python wrapper, and when I call it my Jupyter notebook instance dies. While building the library, I also got a compiler warning that mentions cudaErrorMemoryAllocation.

Here is the complete build trace:

```
$ cmake ..
-- The C compiler identification is GNU 7.4.0
-- The CXX compiler identification is GNU 7.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
Compile with CUDA
-- Found Threads: TRUE
-- Configuring done
-- Generating done
-- Build files have been written to: /home/gunnvant/thundersvm/build

(base) gunnvant@gunnvant-AB350N-Gaming-WIFI:~/thundersvm/build$ make -j
[  3%] Building NVCC (Device) object src/thundersvm/CMakeFiles/thundersvm.dir/kernel/thundersvm_generated_smo_kernel.cu.o
[  7%] Building NVCC (Device) object src/thundersvm/CMakeFiles/thundersvm.dir/kernel/thundersvm_generated_kernelmatrix_kernel.cu.o
Scanning dependencies of target thundersvm
[ 11%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/model/nusvr.cpp.o
[ 15%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/model/oneclass_svc.cpp.o
[ 19%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/model/svmmodel.cpp.o
[ 26%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/model/svr.cpp.o
[ 26%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/model/svc.cpp.o
[ 30%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/solver/nusmosolver.cpp.o
[ 34%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/solver/csmosolver.cpp.o
[ 38%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/model/nusvc.cpp.o
[ 42%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/util/metric.cpp.o
[ 46%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/svm_R_interface.cpp.o
[ 50%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/util/log.cpp.o
[ 57%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/kernelmatrix.cpp.o
[ 57%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/util/common.cpp.o
[ 61%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/dataset.cpp.o
[ 69%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/syncarray.cpp.o
[ 69%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/syncmem.cpp.o
[ 73%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/svm_interface_api.cpp.o
[ 76%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/thundersvm-scikit.cpp.o
[ 80%] Building CXX object src/thundersvm/CMakeFiles/thundersvm.dir/cmdparser.cpp.o
In file included from /home/gunnvant/thundersvm/include/thundersvm/thundersvm.h:13:0,
                 from /home/gunnvant/thundersvm/include/thundersvm/syncmem.h:8,
                 from /home/gunnvant/thundersvm/src/thundersvm/syncmem.cpp:5:
/home/gunnvant/thundersvm/src/thundersvm/syncmem.cpp: In destructor ‘thunder::SyncMem::~SyncMem()’:
/home/gunnvant/thundersvm/include/thundersvm/util/common.h:37:65: warning: throw will always call terminate() [-Wterminate]
     if(error == cudaErrorMemoryAllocation) throw std::bad_alloc(); \
/home/gunnvant/thundersvm/src/thundersvm/syncmem.cpp:29:17: note: in expansion of macro ‘CUDA_CHECK’
                 CUDA_CHECK(cudaFree(device_ptr));
                 ^~~~~~
/home/gunnvant/thundersvm/include/thundersvm/util/common.h:37:65: note: in C++11 destructors default to noexcept
     if(error == cudaErrorMemoryAllocation) throw std::bad_alloc(); \
                                                                 ^
/home/gunnvant/thundersvm/src/thundersvm/syncmem.cpp:29:17: note: in expansion of macro ‘CUDA_CHECK’
                 CUDA_CHECK(cudaFree(device_ptr));
                 ^~~~~~
[ 84%] Linking CXX shared library ../../lib/libthundersvm.so
[ 84%] Built target thundersvm
Scanning dependencies of target thundersvm-predict
Scanning dependencies of target thundersvm-train
[ 88%] Building CXX object src/thundersvm/CMakeFiles/thundersvm-predict.dir/thundersvm-predict.cpp.o
[ 92%] Building CXX object src/thundersvm/CMakeFiles/thundersvm-train.dir/thundersvm-train.cpp.o
[ 96%] Linking CXX executable ../../bin/thundersvm-predict
[100%] Linking CXX executable ../../bin/thundersvm-train
```

I am on Ubuntu 18.04, CUDA 10, AMD 1600, GTX 1070, Anaconda3 (Python 3.7).

zeyiwen commented 5 years ago

We tried several times but could not reproduce the problem. Could you please provide a minimal example so that we can reproduce it?
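One possible shape for such a minimal example is sketched below. It runs the fit in a separate Python process, so a native crash shows up as an exit code instead of killing the notebook kernel. The `REPRO` script is hypothetical: the synthetic data, its shape, and the default `SVC()` arguments are placeholders and not values from the report, and it assumes the `thundersvm` Python package and NumPy are importable. `run_isolated` and `describe` are illustrative helper names.

```python
import subprocess
import sys

# Hypothetical reproduction script (assumption: `thundersvm` and NumPy are
# installed). The data shape and labels are placeholders, not the reporter's.
REPRO = """
import numpy as np
from thundersvm import SVC

rng = np.random.RandomState(0)
X = rng.rand(200, 10)
y = (X[:, 0] > 0.5).astype(int)
SVC().fit(X, y)
print("fit finished")
"""


def run_isolated(code, timeout=600):
    """Run `code` in a fresh interpreter; return (exit_code, stdout, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


def describe(exit_code):
    """Turn a child exit code into a short human-readable status."""
    if exit_code == 0:
        return "ok"
    if exit_code < 0:
        # On POSIX, a negative code means the child was killed by a signal.
        return "killed by signal %d" % -exit_code
    return "exited with code %d" % exit_code
```

Calling `code, out, err = run_isolated(REPRO)` and then posting `describe(code)` together with `err` would give the maintainers the script, the exit status, and any native error output in one place.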

Gunnvant commented 5 years ago

An update: the kernel doesn't die, but the model is taking too long to train. I will close this issue and report training times if they remain long.
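If training times do need to be reported, one way to make them comparable is to time the fit on growing prefixes of the data. The sketch below is only an illustration: `time_fit` is a hypothetical helper, and `fit` can be any callable taking `(X, y)`, so the same code works for a ThunderSVM estimator or any other one.

```python
import time


def time_fit(fit, X, y, sizes):
    """Time fit(X[:n], y[:n]) for each n in sizes; return [(n, seconds), ...]."""
    results = []
    for n in sizes:
        start = time.perf_counter()
        fit(X[:n], y[:n])
        results.append((n, time.perf_counter() - start))
    return results


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without any SVM library installed.
    X = [[float(i), float(i % 7)] for i in range(1000)]
    y = [i % 2 for i in range(1000)]

    def dummy_fit(X_part, y_part):
        # Placeholder for a real training call.
        sum(a * b for row in X_part for a in row for b in row)

    for n, seconds in time_fit(dummy_fit, X, y, [100, 500, 1000]):
        print("n=%d: %.4f s" % (n, seconds))
```

With the Python wrapper, `fit` could be something like `lambda X, y: SVC().fit(X, y)`, assuming `SVC` is imported from `thundersvm` and the data are NumPy arrays. Reporting the `(n, seconds)` pairs shows how the training time grows with the number of samples.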