Closed igorsieradzki closed 9 years ago
The method cannot be implemented because SVMConfiguration does not hold sufficient data. All LibSVM types should be converted to Armadillo matrices, vectors, etc.
Reopening #81, since it seems that not everything is done (e.g. support vectors), or at least it is not commented. For example (???):
//universal parameters
arma::vec w; //d
Before I merge the PR, could you quickly summarize the status of both the svmlight and libsvm prediction methods?
Edit: I can see that we have 'our' prediction only for the linear kernel.
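For reference, prediction with a linear kernel only needs the primal weight vector w and a bias term. A minimal sketch, assuming labels are +1 / -1 and the bias is added (the sign convention in the actual code may differ); `predict_linear` is a hypothetical name, not a function from SVMClient, and plain std::vector is used instead of arma::vec to keep the snippet self-contained:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of SVMClient: classify one sample with a
// linear SVM given the primal weight vector w and a bias term b.
// Returns +1 or -1 according to the sign of w . x + b.
int predict_linear(const std::vector<double>& w,
                   const std::vector<double>& x,
                   double b) {
    double score = b;
    for (std::size_t i = 0; i < w.size() && i < x.size(); ++i) {
        score += w[i] * x[i];
    }
    return score >= 0.0 ? 1 : -1;
}
```

The other kernels cannot be reduced to a single w in input space, which is why they need the support vectors and their coefficients instead.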
As @kudkudak mentioned in #304, we are discussing the current issue here. Changing the milestone due to a bug discovered by @igorsieradzki. I prefer to wait for someone more familiar with R to investigate the SVM data logic in the R part. Meanwhile, I am writing the remaining kernels.
I've pushed a small commit to the svm-wrapper branch. I recommend merging it into your branches; it should fix the strange integers bug.
Basically, what happened: after changing the data type to .RData, the breast_cancer data got saved as factor type, meaning that when we convert it to a matrix, we get the factor's integer level codes, not the values themselves.
No. More. Factors.
Many many thanks for investigating! I am happy that my speculations were good. Getting your code immediately!
Linear, Poly, RBF and Sigmoid are now available in SVMClient.
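For anyone comparing numbers, these are the four kernel formulas as LibSVM documents them. A minimal sketch only, assuming dense equally sized vectors; the function names are made up for illustration and this is not the SVMClient code:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Dot product of two equally sized vectors.
double dot(const Vec& u, const Vec& v) {
    double s = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i) s += u[i] * v[i];
    return s;
}

// Linear: u . v
double linear_kernel(const Vec& u, const Vec& v) {
    return dot(u, v);
}

// Polynomial: (gamma * u . v + coef0)^degree
double poly_kernel(const Vec& u, const Vec& v,
                   double gamma, double coef0, int degree) {
    return std::pow(gamma * dot(u, v) + coef0, degree);
}

// RBF: exp(-gamma * |u - v|^2)
double rbf_kernel(const Vec& u, const Vec& v, double gamma) {
    double sq = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i) {
        const double d = u[i] - v[i];
        sq += d * d;
    }
    return std::exp(-gamma * sq);
}

// Sigmoid: tanh(gamma * u . v + coef0)
double sigmoid_kernel(const Vec& u, const Vec& v,
                      double gamma, double coef0) {
    return std::tanh(gamma * dot(u, v) + coef0);
}
```

Note that gamma, coef0 and degree all have to match between training and prediction, otherwise the accuracies are not comparable.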
Here are the results from the new source('tests/testthat/benchmark.R'):
[1] "0. linear kernel:"
[1] "e1071 acc: 0.972182"
[1] "gmum libsvm acc: 0.970717"
[1] "gmum svmlight acc: 0.972182"
[1] "gmum libsvm 2e acc: 0.972182"
[1] "1. poly kernel:"
[1] "e1071 acc: 0.998536"
[1] "gmum libsvm acc: 0.970717"
[1] "gmum svmlight acc: 0.972182"
[1] "gmum libsvm 2e acc: 0.975110"
[1] "2. rbf kernel:"
[1] "e1071 acc: 0.998536"
[1] "gmum libsvm acc: 0.985359"
[1] "gmum svmlight acc: 0.970717"
[1] "gmum libsvm 2e acc: 0.961933"
[1] "3. sigmoid kernel:"
[1] "e1071 acc: 0.948755"
[1] "gmum libsvm acc: 0.967789"
[1] "gmum svmlight acc: 0.973646"
[1] "gmum libsvm 2e acc: 0.973646"
So, here are some observations based on the results and my current knowledge:
LibSVM accuracy for the linear kernel is poor. Maybe LibSVMRunner is doing something additional that I do not know about, or the parameters are wrong. Two solutions:
LibSVMRunner code and correct SVMClient::predict() implementation,
SVMClient::predict() and make a comment / change (remember to make no conflicts with SVMLight).
(I'll try to investigate as fast as possible)
After adding klaR::svmlight, here are the current results of the benchmark:
[1] "0. linear kernel:"
[1] "e1071 acc: 0.972182"
[1] "gmum libsvm acc: 0.970717"
[1] "klaR svmlight acc: 0.972182"
[1] "gmum svmlight acc: 0.972182"
[1] "gmum libsvm 2e acc: 0.972182"
[1] "1. poly kernel:"
[1] "e1071 acc: 0.998536"
[1] "gmum libsvm acc: 0.970717"
[1] "klaR svmlight acc: 0.975110"
[1] "gmum svmlight acc: 0.972182"
[1] "gmum libsvm 2e acc: 0.975110"
[1] "2. rbf kernel:"
[1] "e1071 acc: 0.998536"
[1] "gmum libsvm acc: 0.985359"
[1] "klaR svmlight acc: 0.970717"
[1] "gmum svmlight acc: 0.970717"
[1] "gmum libsvm 2e acc: 0.961933"
[1] "3. sigmoid kernel:"
[1] "e1071 acc: 0.948755"
[1] "gmum libsvm acc: 0.967789"
[1] "klaR svmlight acc: 0.973646"
[1] "gmum svmlight acc: 0.973646"
[1] "gmum libsvm 2e acc: 0.973646"
(Many lines of klaR::svmlight's prediction output omitted. I couldn't manage to mute it, as commented in the source file.)
Questions: I think there might be something wrong in the poly kernel calculations? And I still need to investigate the LibSVM error.
:+1: for comparison with klaR
Btw. I don't know if this is common knowledge, but e1071 scales the data, so that might be the reason for its higher accuracy.
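To make the comparison fair, the same scaling could be applied before training the other models. As far as I know, e1071's svm() scales each variable to zero mean and unit variance unless told otherwise. A rough sketch of that per-column standardization, assuming a dense row-major matrix; `standardize_columns` is a hypothetical helper, not something from gmum or e1071:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // rows are samples

// Standardize every column in place to zero mean and unit sample variance.
// Constant columns are only centered, to avoid dividing by zero.
void standardize_columns(Matrix& data) {
    if (data.empty()) return;
    const std::size_t n = data.size();
    const std::size_t d = data[0].size();
    for (std::size_t j = 0; j < d; ++j) {
        double mean = 0.0;
        for (const auto& row : data) mean += row[j];
        mean /= static_cast<double>(n);

        double ss = 0.0;
        for (const auto& row : data) ss += (row[j] - mean) * (row[j] - mean);
        // Sample standard deviation (n - 1 denominator), like R's sd().
        const double sd =
            n > 1 ? std::sqrt(ss / static_cast<double>(n - 1)) : 0.0;

        for (auto& row : data) {
            row[j] -= mean;
            if (sd > 0.0) row[j] /= sd;
        }
    }
}
```

The mean and standard deviation computed on the training set would have to be reused for the test set, otherwise the two are scaled differently.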
Hey... Thanks for the info! In that case I think this task is done [1]. More elaborate comparisons / tests / accuracies should go to #312 (if it is not done yet).
[1] I've been studying the LibSVMRunner code in comparison with the LibSVM prediction calculations, and in my opinion all parameters are being stored correctly within SVMConfiguration.
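To spell out which parameters a kernel prediction actually needs, here is a sketch of the LibSVM-style decision value, sum_i coef_i * K(sv_i, x) - rho, for the RBF case. The `KernelModel` struct and its field names are hypothetical and only for illustration; they are not the real SVMConfiguration layout, and labels are assumed to be +1 / -1:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Hypothetical container; field names are illustrative only.
struct KernelModel {
    std::vector<Vec> support_vectors;  // one entry per support vector
    Vec dual_coefs;                    // alpha_i * y_i per support vector
    double rho = 0.0;                  // offset, subtracted from the sum
    double gamma = 1.0;                // RBF width parameter
};

// RBF kernel: exp(-gamma * |u - v|^2).
double rbf(const Vec& u, const Vec& v, double gamma) {
    double sq = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i) {
        const double d = u[i] - v[i];
        sq += d * d;
    }
    return std::exp(-gamma * sq);
}

// Decision value: sum_i dual_coefs[i] * K(sv_i, x) - rho.
double decision_value(const KernelModel& m, const Vec& x) {
    double sum = 0.0;
    for (std::size_t i = 0; i < m.support_vectors.size(); ++i) {
        sum += m.dual_coefs[i] * rbf(m.support_vectors[i], x, m.gamma);
    }
    return sum - m.rho;
}

// Predicted label is the sign of the decision value.
int predict_label(const KernelModel& m, const Vec& x) {
    return decision_value(m, x) > 0.0 ? 1 : -1;
}
```

So the minimum to store is the support vectors, their coefficients, rho, and the kernel parameters (gamma here, plus coef0 and degree for the poly and sigmoid kernels).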
Here we go with pull request.
Like in SVMLight. Call @igorsieradzki in case of any questions.