Xtra-Computing / thundersvm

ThunderSVM: A Fast SVM Library on GPUs and CPUs
Apache License 2.0

grid search on thundersvm? #87

Closed pardedetalenta closed 6 years ago

pardedetalenta commented 6 years ago

Hi, I'm confused about how to do grid search with thundersvm from the Python interface. Is it possible? Kindly update the docs about it :) Thank you.

shijiashuai commented 6 years ago

Hi. For the scikit-learn-style Python interface, please refer to grid search in scikit-learn. Just pass the thundersvm model as its estimator.
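As a rough sketch of the suggestion above (not code from the thundersvm maintainers), a grid search could look like the following. The parameter values, the helper names count_candidates and run_grid_search, and the fallback to scikit-learn's own SVC when thundersvm is not installed are all assumptions for illustration. n_jobs is left at its default, so candidates are evaluated serially.

```python
import math

# Hypothetical search space; these values are illustrative, not recommendations.
PARAM_GRID = {"C": [0.1, 1, 10, 100], "gamma": [0.01, 0.1, 0.5]}


def count_candidates(grid):
    """Number of parameter combinations an exhaustive grid search will try."""
    return math.prod(len(values) for values in grid.values())


def run_grid_search(x, y, cv=5):
    """Grid-search C and gamma with a thundersvm SVC as the estimator."""
    # Imports are deferred so the helpers above work without either library.
    try:
        from thundersvm import SVC  # GPU/CPU implementation discussed in this thread
    except ImportError:
        from sklearn.svm import SVC  # assumption: fall back to scikit-learn's SVC
    from sklearn.model_selection import GridSearchCV

    # n_jobs is not set, so GridSearchCV runs the fits one after another.
    search = GridSearchCV(SVC(kernel="rbf"), PARAM_GRID, scoring="accuracy", cv=cv)
    search.fit(x, y)
    return search.best_params_, search.best_score_
```

Calling run_grid_search(x, y) with a feature matrix and labels would return the best parameter combination and its mean cross-validated accuracy. With this grid, 12 candidates and 5 folds mean 60 cross-validation fits, plus one final refit on the full data.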

pardedetalenta commented 6 years ago

Thank you @shijiashuai. Can I ask one more question: how do I load a model with load_from_file? Please help, and thank you :)

QinbinLi commented 6 years ago

For example:

    from sklearn.datasets import load_svmlight_file
    from thundersvm import SVC

    x, y = load_svmlight_file("../dataset/test_dataset.txt")
    clf = SVC(verbose=True, gamma=0.5, C=100)
    clf.fit(x, y)
    clf.save_to_file('./model')

Next time you run Python:

    clf = SVC()
    clf.load_from_file('model')

Then you can use clf directly for prediction without calling fit():

    x2, y2 = load_svmlight_file("../dataset/test_dataset.txt")
    clf.predict(x2)

pardedetalenta commented 6 years ago

Thank you @GODqinbin. I have tried using grid search from scikit-learn, but it gives me this error:

    FATAL [default] Check failed: [error == cudaSuccess] invalid device pointer
    2018-07-04 01:18:15 PST 2018-07-04 08:18:15,057 WARNING [default] Aborting application. Reason: Fatal log at [/floyd/home/thundersvm/src/thundersvm/syncmem.cpp:29]

What should I do?

QinbinLi commented 6 years ago

Can you tell us the commands you used and share your dataset? I tried grid search on test_dataset.txt and it worked fine.

pardedetalenta commented 6 years ago

You can see my code here: https://www.floydhub.com/talenta10/projects/ta-fit-svm-adaboost/126/files/ta-fit-smote-pca-svm-test.py and my dataset here: https://www.floydhub.com/talenta10/datasets/my-cool-dataset

Thank you in advance.

pardedetalenta commented 6 years ago

Hi @GODqinbin, or maybe I have to convert my CSV dataset to the libsvm txt format? Is that possible? Kindly give me your advice on this problem. Thank you :)

QinbinLi commented 6 years ago

Since you have already extracted x and y from the CSV dataset, you don't need to convert it to the libsvm txt format. I can't pinpoint your problem yet because I haven't been able to reproduce it. I trained on your dataset using svc.fit() and it worked fine. I can also run your script on my machine after changing this line:

    clf = GridSearchCV(SVC(), param_grid, scoring=score, n_jobs=-1, verbose=2, cv=5)

to:

    clf = GridSearchCV(SVC(), param_grid, scoring=score, cv=5)

I suggest you first try training on the dataset without grid search:

    clf = SVC(verbose=True, gamma=0.01, C=0.1)
    clf.fit(xtest, ytest)

If that runs successfully, then the dataset and training are fine and the problem should be in the grid search.
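To illustrate the CSV point above, here is a minimal sketch of pulling features and labels straight out of CSV text with the standard library. The column name "label", the sample rows, and the helper name split_features_labels are hypothetical and do not come from the dataset linked in this thread.

```python
import csv
import io


def split_features_labels(csv_text, label_column="label"):
    """Split CSV text into feature rows (floats) and integer class labels."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Every column except the label column is treated as a numeric feature.
    feature_names = [name for name in reader.fieldnames if name != label_column]
    x, y = [], []
    for row in reader:
        x.append([float(row[name]) for name in feature_names])
        y.append(int(row[label_column]))
    return x, y


# Hypothetical two-feature dataset with a binary label.
SAMPLE_CSV = "f1,f2,label\n0.5,1.0,0\n1.5,2.0,1\n"
x, y = split_features_labels(SAMPLE_CSV)
```

The resulting x and y are plain Python lists. Converting them with numpy.asarray before calling fit() is a safe choice, since scikit-learn-style estimators work with array inputs.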

pardedetalenta commented 6 years ago

Yay, thank you @GODqinbin. I think n_jobs caused it. Thank you again :)