Xtra-Computing / thundersvm

ThunderSVM: A Fast SVM Library on GPUs and CPUs
Apache License 2.0

Memory limiting is ignored on CUDA devices #189

Open furkantektas opened 4 years ago

furkantektas commented 4 years ago

Hi, I'm trying to train a C-SVM or a OneClassSVM using either the Python API or the CLI tools, and both fail to allocate CUDA memory on the master branch. Even though I set a memory limit and have far more GPU/RAM available than that limit, the code throws a memory allocation exception. The training data looks like this:

X_train.shape: (144396, 2048), dtype: float32
y_train.shape: (144396,), dtype: int64
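
For reference, stand-in arrays with the same shapes and dtypes can be generated with numpy; the values below are random placeholders rather than my real features, but they are enough to hit the error:

import numpy as np

# Synthetic stand-ins matching the shapes/dtypes reported above
X_train = np.random.rand(144396, 2048).astype(np.float32)
y_train = np.random.randint(2, size=144396).astype(np.int64)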

The sample code is as follows:

from thundersvm import OneClassSVM
from sklearn.multiclass import OneVsRestClassifier

# OneClassSVM with max_mem_size=6000 (MB) intended to cap GPU memory usage
svc = OneClassSVM(
    kernel="polynomial",
    degree=3,
    gamma="auto",
    coef0=0.0,
    nu=0.1,
    tol=0.001,
    shrinking=False,
    cache_size=1000,
    verbose=False,
    max_iter=-1,
    n_jobs=-1,
    max_mem_size=6000,
    random_state=0,
)

# fit() fails here with a CUDA memory allocation error
model = OneVsRestClassifier(svc).fit(X_train, y_train)

I have tried similar code snippets for training with the Python API, and all of them failed. However, I can train the model with exactly the same parameters via the CLI; in that case, testing through the CLI didn't work.

An example of a CLI training command that works:

thundersvm-train -s 0 -t 0 -d 3 -m 6500 -u 0 -q train.data.file svm.model.file
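
For completeness, the CLI tools read data in the LIBSVM text format; one way to export the numpy arrays is sketched below (the output file name simply matches the command above):

from sklearn.datasets import dump_svmlight_file

# X_train / y_train as described earlier, written out in LIBSVM/svmlight text format
dump_svmlight_file(X_train, y_train, "train.data.file")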

After I compiled ThunderSVM without CUDA, I was able to run prediction, but it took very long (I didn't measure it exactly, but roughly an hour or more). The test data has the following properties:

X_test.shape: (25000, 2048), dtype: float32
y_test.shape: (25000,), dtype: int64

The prediction command I used:

thundersvm-predict -m 2000 -o -1 test.data cmdline-rbf.model predictions
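
Since I didn't time the CPU-only run precisely, this is roughly how I would measure it through the Python API instead of the CLI (a sketch; it assumes load_from_file can read the model file written by thundersvm-train, which I haven't verified):

import time
import numpy as np
from thundersvm import SVC

clf = SVC()
clf.load_from_file("cmdline-rbf.model")  # model file from the prediction command above

X_test = np.random.rand(25000, 2048).astype(np.float32)  # stand-in for the real test set

start = time.perf_counter()
predictions = clf.predict(X_test)
print(f"prediction took {time.perf_counter() - start:.1f} s")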

You can reproduce the bug using random numpy arrays on the master branch. I've compiled it from source on Arch Linux on two different machines, one with an NVIDIA GTX 1080 and one with a 2080 Ti. Both failed with the memory allocation error.

The NVIDIA driver version I'm using is 440.36, the CUDA toolkit in the conda environment is 10.0.130, and the system-wide installed CUDA version is 10.2.

Thanks,

QinbinLi commented 4 years ago

Hi, @furkantektas

I'm sorry, but I cannot reproduce the bug using random numpy arrays on our machine (CUDA 10.0, RTX 2080 Ti). The code is as follows:

from thundersvm import *
import numpy as np
from sklearn.multiclass import OneVsRestClassifier

# Random data with the same shapes as in the report (np.random.rand returns float64)
X_train = np.random.rand(144396,2048)
y_train = np.random.randint(2,size=144396)

svc = OneClassSVM(
        kernel="polynomial",
        degree=3,
        gamma="auto",
        coef0=0.0,
        nu=0.1,
        tol=0.001,
        shrinking=False,
        cache_size=1000,
        verbose=False,
        max_iter=-1,
        n_jobs=-1,
        max_mem_size=6000,
        random_state=0,
)
model = OneVsRestClassifier(svc).fit(X_train, y_train)

ThunderSVM runs successfully and the GPU memory usage doesn't exceed 6GB. I'm not sure whether the bug is related to the training data. Can you provide the training data set or another example that reproduces the bug?
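
For reference, one simple way to watch the usage while the script runs is to poll nvidia-smi from another terminal or process; a minimal sketch (the one-second interval is arbitrary):

import subprocess
import time

# Print the GPU memory in use (MiB) once per second; stop with Ctrl-C
while True:
    used = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"]
    ).decode().strip()
    print(f"{used} MiB in use")
    time.sleep(1)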

Liqian-czp commented 4 years ago

Hi, I have the same issue. I installed thundersvm-cu90 on my machine (CUDA 9.0, GTX 1080 Ti), but it can't allocate CUDA memory on the master branch when I train on my data (the data is similar to furkantektas's).