JuliaML / LIBSVM.jl

LIBSVM bindings for Julia

Callable kernel #88

Closed · till-m closed 2 years ago

till-m commented 2 years ago

Resolves #86

codecov-commenter commented 2 years ago

Codecov Report

Merging #88 (3613207) into master (2964e6b) will increase coverage by 0.81%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #88      +/-   ##
==========================================
+ Coverage   86.82%   87.63%   +0.81%     
==========================================
  Files           6        6              
  Lines         258      275      +17     
==========================================
+ Hits          224      241      +17     
  Misses         34       34              
Impacted Files            Coverage Δ
src/ScikitLearnAPI.jl     40.54% <ø> (ø)
src/LIBSVM.jl             94.08% <100.00%> (+0.66%) ↑
src/ScikitLearnTypes.jl   100.00% <100.00%> (ø)


ablaom commented 2 years ago

@till-m Thanks for this.

This looks good to me, my comment regarding the use of Function notwithstanding. You'll still need a maintainer to review this.
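
(For context, the concern with a Function restriction is presumably that Julia callables need not subtype Function, so a kernel::Function annotation would reject callable objects. A minimal illustration; PolyKernel is a made-up example type:)

struct PolyKernel
    degree::Int
end
(k::PolyKernel)(x1, x2) = (x1' * x2 + 1)^k.degree

k = PolyKernel(2)
k([1.0, 2.0], [3.0, 4.0])  # 144.0, works fine as a kernel
k isa Function             # false, so a kernel::Function annotation rejects it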

till-m commented 2 years ago

Thank you for your comments @ablaom, much appreciated! I'm currently waiting on the fate of #87 to be decided so that I can fix any merge conflicts that might arise, after which I will mark this PR ready for review.

iblislin commented 2 years ago

oh, well, I need some time to check PR #87. (quite busy this week)

barucden commented 2 years ago

I have a general question. Does this have any effect on serialization?

TBH I have never serialized the SVM structure, but I have always assumed that it would be possible. Can a Function be serialized?

till-m commented 2 years ago

I would appreciate it if someone could re-run the tests from 9ab1ef1. At least the macOS test seems to fail for unrelated reasons (failed to resolve address for github.com: nodename nor servname provided, or not known); I'm not sure what the problem with the Windows test is. I don't think I have permission to re-run jobs.

till-m commented 2 years ago

Thanks for re-running @iblis17, looks like the failures were a fluke. I'll mark this as ready for review now.

@barucden

Can a Function be serialized?

It seems like this is indeed possible, in principle:

using Serialization
foo(x) = x^2
serialize("foo.file", foo)
bar = deserialize("foo.file")
bar(2) # 4

I would guess that if one tries hard enough, the serialization will probably break down, but hopefully such cases are very niche.

barucden commented 2 years ago

Unfortunately, it's not that easy. If you kill the Julia session after the serialization and try to load the file:

using Serialization
bar = deserialize("foo.file") #ERROR: UndefVarError: #foo not defined

foo(x) = x^2
bar = deserialize("foo.file")
bar(2) # 4

Essentially, the function can be deserialized only if it is defined within the session.

JLD.jl deals with a related issue (serializing custom structures) via the addrequire function. IMHO, it would be sufficient to extend our README: document how to use the callable kernel, and mention that the kernel must be defined before deserialization.
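
A rough sketch of that mechanism, where MyKernels is a hypothetical user module that defines the kernel (and the model is assumed to have been trained with it):

using JLD
using MyKernels  # hypothetical module defining kernel(x1, x2)

jldopen("model.jld", "w") do file
    addrequire(file, MyKernels)   # record the dependency inside the file
    write(file, "model", model)   # model trained with MyKernels.kernel
end

# load() re-imports MyKernels before reconstructing the stored objects
model = load("model.jld", "model")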

Edit: It would be great if we had a test that covers the serialization/deserialization when a callable kernel is used.

till-m commented 2 years ago

Okay, that's... decidedly less useful.

a test that covers the serialization/deserialization when a callable kernel is used

Like this, i.e. just to check whether it works if the kernel-callable is defined? Or are you trying to cover something else?

using LIBSVM, Serialization, Test

distance(x) = x[1]^2 + x[2]^2

X = [0.72  0.68  0.28  0.75  0.47  0.26  0.95  0.0   0.95  0.39;
    0.49  0.07  0.67  0.94  0.4   0.98  0.21  0.29  0.91  0.16]

y = 0.5 .< [distance(x) for x in eachcol(X)]

kernel(x1, x2) = x1' * x2 + distance(x1) * distance(x2)

model  = svmtrain(X, y, kernel=kernel)
serialize("foo.file", model)

model₂ = deserialize("foo.file")
T = [0.57  0.56  0.57  0.51;
    0.9   0.37  0.04  0.76]

ŷ = 0.5 .< [distance(x) for x in eachcol(T)]
ỹ, _ = svmpredict(model₂, T)
@test ŷ == ỹ

barucden commented 2 years ago

Yes, I think it could be even simpler:

using LIBSVM, Serialization, Test

distance(x) = x[1]^2 + x[2]^2

X = [0.72  0.68  0.28  0.75  0.47  0.26  0.95  0.0   0.95  0.39;
     0.49  0.07  0.67  0.94  0.4   0.98  0.21  0.29  0.91  0.16]

y = 0.5 .< [distance(x) for x in eachcol(X)]

kernel(x1, x2) = x1' * x2 + distance(x1) * distance(x2)

model₁ = svmtrain(X, y, kernel=kernel)
serialize("foo.file", model₁)

model₂ = deserialize("foo.file")

y₁ = svmpredict(model₁, X)
y₂ = svmpredict(model₂, X)

@test y₁ == y₂

Just to check that the loaded model works (and works the same).

till-m commented 2 years ago

The Windows Julia 1.3 test failed again in 2e8c64e; I'm not sure why. Other than that, I think I got all of the changes in and it seems to be working.

iblislin commented 2 years ago

ah, LGTM. :+1:

I'm going to merge this PR tomorrow if there are no objections.

till-m commented 2 years ago

I think one last question to settle would be whether the ScikitLearn.jl API should also be supported. I assume that requires only the relaxation of a few type checks.
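
For what it's worth, a rough sketch of what that could look like, assuming the usual fit!/predict entry points and that the kernel keyword of SVC would accept a plain function the same way svmtrain does:

using LIBSVM

X = rand(10, 2)     # 10 samples, 2 features (rows are observations in this API)
y = rand(Bool, 10)

k(x1, x2) = (x1' * x2 + 1)^2   # a polynomial kernel as a plain function

model = fit!(SVC(kernel=k), X, y)
ŷ = predict(model, X)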

iblislin commented 2 years ago

I think one last question to settle would be whether the ScikitLearn.jl API should also be supported. I assume that requires only the relaxation of a few type checks.

Yeah, it should also be supported. You want to include it in this PR or add another PR?

till-m commented 2 years ago

You want to include it in this PR or add another PR?

Whichever is more standard or whichever you prefer.

iblislin commented 2 years ago

Whichever is more standard or whichever you prefer.

When you start modifying the code: if you find that the changes are complex and would need more than 3 commits to keep them clear, opening a new PR for them will be the better option. Otherwise, including them in the original PR is fine.

till-m commented 2 years ago

I removed the type checks and added a test case. Seems to be working.

iblislin commented 2 years ago

Out of curiosity, is there a kernel function such that the Gram matrix isn't symmetric (Gram[i, j] != Gram[j, i])?

till-m commented 2 years ago

Kernels are associated with an inner product on some Hilbert space via a feature map φ that maps into that space: k(x1, x2) = <φ(x1), φ(x2)>, and inner products are (Hermitian) symmetric. Since a symmetric kernel function leads to a symmetric Gram matrix, all real kernels produce symmetric Gram matrices. Now, I have never seen a complex kernel function (in a machine learning context) and I don't know if they exist (or if LIBSVM supports complex-valued precomputed Gram matrices). I've checked a few textbooks (Mathematics for Machine Learning by Deisenroth, Support Vector Machines by Steinwart, and Schölkopf/Smola's Learning with Kernels) and they all explicitly mention that the Gram matrix is symmetric and/or define kernels as real-valued functions.
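
Spelled out for the real case:

k(x_i, x_j) = \langle \varphi(x_i), \varphi(x_j) \rangle
            = \langle \varphi(x_j), \varphi(x_i) \rangle
            = k(x_j, x_i),
\qquad \text{hence} \qquad G_{ij} = G_{ji}.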

iblislin commented 2 years ago

Thanks for your contributions!

iblislin commented 2 years ago

I will release these patches as version 0.8.0 tomorrow.

iblislin commented 2 years ago

Kernels are associated with an inner product on some Hilbert space via a feature map φ that maps into that space: k(x1, x2) = <φ(x1), φ(x2)>, and inner products are (Hermitian) symmetric. Since a symmetric kernel function leads to a symmetric Gram matrix, all real kernels produce symmetric Gram matrices. ...

A side note: I just found a work [1]. IIUC, if the user can provide a positive definite matrix, we can safely pass it to libsvm.

[1] Tsuda, K. (1999). Support vector classifier with asymmetric kernel functions. In European Symposium on Artificial Neural Networks (ESANN).
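
A quick numerical sanity check one could run before passing a precomputed Gram matrix along (illustrative; the RBF kernel and the jitter term are arbitrary choices):

using LinearAlgebra

k(x1, x2) = exp(-norm(x1 - x2)^2)                    # RBF kernel
X = rand(2, 10)
G = [k(X[:, i], X[:, j]) for i in 1:10, j in 1:10]

issymmetric(G)                     # true for any real, symmetric kernel
isposdef(Symmetric(G + 1e-10I))    # PD check, with a tiny diagonal jitter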

till-m commented 2 years ago

Thanks for the guidance @ablaom @barucden @iblis17.