JuliaML / LIBSVM.jl

LIBSVM bindings for Julia

How does one specify a pre-computed kernel? #71

Closed: ablaom closed this 3 years ago

ablaom commented 3 years ago

In the original C-code one puts the kernel in a file passed as a command-line argument: https://github.com/cjlin1/libsvm/blob/eedbb147ea79af44f2ecdca1064f2c6a8fe6462d/README#L138
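
For reference, the upstream workflow (per that README) is roughly

svm-train -t 4 train.txt

where -t 4 selects the precomputed kernel type and each line of the training file has the form

<label> 0:i 1:K(i,1) 2:K(i,2) ... L:K(i,L)

(train.txt here is just a placeholder name.)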

How does one do it for LIBSVM.jl?

iblislin commented 3 years ago

I think you can just use Kernel.Precomputed

ablaom commented 3 years ago

Yes, but where is the interface point for the actual kernel you have "pre-computed"? Setting kernel=Kernel.Precomputed just flags that you want to use a precomputed kernel; it does not let you say what the kernel is. At least, that is my understanding. In the C version, you could put this matrix in a file specified in the command-line call to the fitting function.

See also https://discourse.julialang.org/t/how-do-i-specify-a-precomputed-kernel-in-svm-using-libsvm-or-mlj/61962

barucden commented 3 years ago

LIBSVM.jl just wraps the calls to the C implementation. Looking at the source of libsvm, it seems that the precomputed kernel values should be in the input matrix X. See

double kernel_precomputed(int i, int j) const {
    // x[j][0].value holds the (1-based) index of instance j; the kernel
    // value k(i, j) is read from that column of row i.
    return x[i][(int)(x[j][0].value)].value;
}

It seems that, in this case, the first column of matrix X is reserved for indices: (int)(x[j][0].value) returns the index of the j-th instance (why not use j directly, though?). Then x[i][(int)(x[j][0].value)].value presumably yields k(i, j).

So, since Julia uses 1-based indexing... try passing X where X[i, 1] = i and X[i, j] = k(i, j - 1) for j > 1? Like so:

1  k(x_1, x_1)  k(x_1, x_2)  k(x_1, x_3)  k(x_1, x_4)
2  k(x_2, x_1)  k(x_2, x_2)  k(x_2, x_3)  k(x_2, x_4)
3  k(x_3, x_1)  k(x_3, x_2)  k(x_3, x_3)  k(x_3, x_4)
4  k(x_4, x_1)  k(x_4, x_2)  k(x_4, x_3)  k(x_4, x_4)

Could you verify that this is actually correct? If it is, we should definitely consider adding it to our documentation. We could also improve the API by accepting the Gram matrix and automatically prepending the required index column (see the sketch below).

Edit: for prediction, the first column can contain arbitrary values; libsvm does not use it.
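
A minimal sketch of that convenience, assuming an L×L Gram matrix G with G[i, j] = k(x_i, x_j) (with_index_column is a hypothetical helper, not part of LIBSVM.jl's API):

# Prepend the 1-based index column that libsvm's precomputed format expects.
with_index_column(G::AbstractMatrix) = hcat(1:size(G, 1), G)

# A 4×4 Gram matrix becomes the 4×5 layout shown above.
G = [1.0 0.5 0.2 0.1;
     0.5 1.0 0.4 0.3;
     0.2 0.4 1.0 0.6;
     0.1 0.3 0.6 1.0]
K = with_index_column(G)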

ablaom commented 3 years ago

@barucden Many thanks for this investigation!

@mrussek Is this something you can confirm for us?

ablaom commented 3 years ago

I mean: check it directly with LIBSVM.jl. If it's true, I'm sure we could make the MLJ interface a little more intuitive.

mrussek commented 3 years ago

I'm trying to use svmtrain in Julia in that way, and it doesn't seem to work. The following builds a kernel matrix like the one you described, on a synthetic data set:

using LIBSVM
using LinearAlgebra

X = randn(100, 2)
y = X[:, 1] .> 0.3;

# Column 1 holds the 1-based instance indices; columns 2:101 hold the
# (symmetric) RBF Gram matrix k(x_i, x_j) = exp(-||x_i - x_j||^2).
K = zeros(100, 101)
K[:, 1] = 1:100
for i in 1:100, j in 1:i
    K[i, j+1] = exp(-norm(X[i, :] .- X[j, :])^2)
    K[j, i+1] = K[i, j+1]  # symmetry
end

but then this errors out

model = svmtrain(K, y; kernel=Kernel.Precomputed)

with this error

Size of second dimension of training instance matrix
(101) does not match length of labels
(100)

Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] indices_and_weights(labels::BitVector, instances::Matrix{Float64}, weights::Nothing)
   @ LIBSVM ~/.julia/packages/LIBSVM/mQbZW/src/LIBSVM.jl:234
 [3] svmtrain(X::Matrix{Float64}, y::BitVector; svmtype::Type, kernel::LIBSVM.Kernel.KERNEL, degree::Int64, gamma::Float64, coef0::Float64, cost::Float64, nu::Float64, epsilon::Float64, tolerance::Float64, shrinking::Bool, probability::Bool, weights::Nothing, cachesize::Float64, verbose::Bool, nt::Int64)
   @ LIBSVM ~/.julia/packages/LIBSVM/mQbZW/src/LIBSVM.jl:344
 [4] top-level scope
   @ In[56]:14

which seems to indicate that the extra column is the problem: svmtrain treats columns as instances, so the 100×101 matrix looks like 101 instances of 100 features each.

If we drop it (or even just transpose K), training seems to work:

K = K[:,2:end]
model = svmtrain(K, y; kernel=Kernel.Precomputed)

But inference doesn't seem to work on new examples?

svmpredict(model, randn(2, 10))
Model has 100 but 10 provided

Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] svmpredict(model::LIBSVM.SVM{Bool}, X::Matrix{Float64}; nt::Int64)
   @ LIBSVM ~/.julia/packages/LIBSVM/mQbZW/src/LIBSVM.jl:383
 [3] svmpredict(model::LIBSVM.SVM{Bool}, X::Matrix{Float64})
   @ LIBSVM ~/.julia/packages/LIBSVM/mQbZW/src/LIBSVM.jl:378
 [4] top-level scope
   @ In[91]:1

PS: Just for clarity, I'm not actually trying to use an RBF kernel but something more sophisticated; I know libsvm has RBF support.

barucden commented 3 years ago

I've looked into it, and as far as I can tell, LIBSVM.jl needs some adjustments to properly support precomputed kernels. I'll try to come up with something.

barucden commented 3 years ago

@ablaom @mrussek It is not as trivial as I originally thought. The main issue is creating the SVMNodes from the input matrix; in the case of a precomputed kernel, our current method does not work as intended.

I think I am on the right track now but suggestions are welcome in #73. So far I have changed only training (not prediction).
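
For anyone following along, here is a rough sketch of the record layout the C library expects for a precomputed kernel, per the upstream README (Node is a stand-in for LIBSVM.jl's internal SVMNode, not the package's actual code):

struct Node
    index::Int32
    value::Float64
end

# Instance i of an L×L Gram matrix K is encoded as
#   0:i  1:K(i,1)  ...  L:K(i,L)
# with an index of -1 marking the end of the record.
function precomputed_nodes(K::AbstractMatrix, i::Integer)
    L = size(K, 2)
    nodes = Node[Node(0, i)]
    append!(nodes, Node(j, K[i, j]) for j in 1:L)
    push!(nodes, Node(-1, 0.0))
    return nodes
end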

mrussek commented 3 years ago

@barucden Thank you!