JuliaML / LIBSVM.jl

LIBSVM bindings for Julia

Add support for precomputed kernel. #73

Closed barucden closed 3 years ago

barucden commented 3 years ago

This is an effort to properly support precomputed kernels.

There are a few more steps left:

Closes #71

codecov-commenter commented 3 years ago

Codecov Report

Merging #73 (852b2c6) into master (56e9a2e) will increase coverage by 1.86%. The diff coverage is 98.46%.


@@            Coverage Diff             @@
##           master      #73      +/-   ##
==========================================
+ Coverage   84.75%   86.61%   +1.86%     
==========================================
  Files           5        6       +1     
  Lines         223      254      +31     
==========================================
+ Hits          189      220      +31     
  Misses         34       34              
Impacted Files    Coverage Δ
src/LIBSVM.jl     93.24% <94.11%> (-0.70%) ↓
src/nodes.jl     100.00% <100.00%> (ø)


iblislin commented 3 years ago

Is there any sample code for the precomputed kernel in C that could help us verify these changes?

barucden commented 3 years ago

> Is there any sample code for the precomputed kernel in C that could help us verify these changes?

There are some tests in scikit-learn; the test I added is loosely based on them. When I passed the test-case problem directly to svm-train (I mean the CLI program from LIBSVM), I received different results than those in scikit-learn. In scikit-learn, the indices of the support vectors are [1, 3], while I saw [2, 4]. I believe this is because LIBSVM uses one-based indexing and scikit-learn shifts it to zero-based (I have not verified this, though). Also, the coefficients are ordered differently: scikit-learn's [[-0.25, 0.25]] vs. our [0.25; -0.25].

Anyway, I think we can use svm-train to produce expected results for our test cases.
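
For reference, LIBSVM's own README documents the input format that svm-train expects for a precomputed kernel (-t 4): each row starts with the label, then a pseudo-feature 0:i holding the one-based serial number of the sample, followed by the kernel values against every training sample. A tiny hand-made file (the values below are illustrative, not taken from our test case) would look like:

```
+1 0:1 1:4.0 2:6.0 3:1.0
+1 0:2 1:6.0 2:9.0 3:2.0
-1 0:3 1:1.0 2:2.0 3:3.0
```

Training it with `svm-train -t 4 train.txt` then produces a model file whose support vectors are stored as those one-based serial numbers, which matches the one-based indices mentioned above.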

barucden commented 3 years ago

Prediction now works (according to the simple test case, at least).

barucden commented 3 years ago

I have added two more test cases. I think we can consider it working.

barucden commented 3 years ago

OK, I guess this is ready for review now.

One thing that is not so great is that sparse matrices are converted to full matrices in the case of a precomputed kernel. I did this because I could not figure out how to reuse the code path that we use for sparse matrices with non-precomputed kernels. I don't think this is a significant issue, since kernels tend to produce non-zero values anyway.
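
To illustrate that last point with a hypothetical snippet (not code from this PR): even when the inputs are very sparse, the Gram matrix is typically fully dense; with an RBF kernel, for instance, every entry is strictly positive.

```julia
using SparseArrays

# 10 features × 6 samples, roughly 80% of the entries zero
X = sprand(10, 6, 0.2)
γ = 0.5
rbf(a, b) = exp(-γ * sum(abs2, a - b))

# Gram matrix between all pairs of samples
K = [rbf(X[:, i], X[:, j]) for i in 1:6, j in 1:6]

count(!iszero, K)  # 36 — every entry of the Gram matrix is nonzero
```

So densifying the kernel matrix costs little beyond what storing it already does.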

iblislin commented 3 years ago

@barucden I'm going to make a new release for these changes. Are there any planned PRs you would like to ship with the new release?

ablaom commented 3 years ago

@barucden Thank you indeed for this work. I realize it was more than you bargained for.

While it's fresh in your mind, could you add a brief note to the readme documentation on the new option for specifying the kernel? That would expedite updating the MLJ interface.

barucden commented 3 years ago

> @barucden I'm going to make a new release for these changes. Are there any planned PRs you would like to ship with the new release?

We could address #61 in this release if you are interested. I still have the code.

> @barucden Thank you indeed for this work. I realize it was more than you bargained for.
>
> While it's fresh in your mind, could you add a brief note to the readme documentation on the new option for specifying the kernel? That would expedite updating the MLJ interface.

I have already added a few notes to the docstrings of the relevant methods (svmtrain, svmpredict), but I agree that a small example showing the whole workflow might help too.
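
Something along these lines, perhaps — a minimal sketch assuming the interface added in this PR (the Gram matrix passed as the data argument together with kernel=Kernel.Precomputed); the exact call should be checked against the merged code:

```julia
using LIBSVM

# Training data: columns are observations.
X = [-2.0 -1.0 -1.0 1.0 1.0 2.0;
     -1.0 -1.0 -2.0 1.0 2.0 1.0]
y = [1, 1, 1, 2, 2, 2]

# Gram matrix between all pairs of training samples (a linear kernel here).
K = X' * X

model = svmtrain(K, y, kernel=Kernel.Precomputed)

# For prediction, pass the kernel values between the training
# samples and the test samples (one column per test sample).
T = [-1.0 2.0;
     -1.0 2.0]
KK = X' * T

ŷ, decision_values = svmpredict(model, KK)
```

Here KK has one row per training sample and one column per test sample, mirroring the observations-as-columns convention used elsewhere in the package.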