I think you can just use `Kernel.Precomputed`.
Yes, but where's the interface point for the actual kernel you have "pre-computed"? Setting `kernel=Kernel.Precomputed` just flags that you want to use a precomputed kernel but does not allow you to say what the kernel is. At least that is my understanding. In the C version, you could put this matrix in a file that you specify in the command-line call to the fitting function.
LIBSVM.jl just wraps the calls to the C implementation. Looking at the source of libsvm, it seems that the precomputed kernel values should be in the input matrix `X` itself. See
```c
// libsvm's lookup for precomputed kernels: the kernel value k(i, j) is read
// from the input matrix itself, using x[j][0].value as the column index.
double kernel_precomputed(int i, int j) const {
    return x[i][(int)(x[j][0].value)].value;
}
```
It seems that, in this case, the first column of matrix `X` is reserved for indices, meaning `(int)(x[j][0].value)` returns the index of the `j`-th instance (why not use `j` directly, though?). Then `x[i][(int)(x[j][0].value)].value` presumably yields `k(i, j)`.
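To restate that lookup in Julia terms (purely illustrative; LIBSVM.jl exposes no such function):

```julia
# Paraphrase of the C lookup for a row-oriented Julia matrix `x` laid out as
# proposed below: x[j, 1] stores the index of instance j, and k(i, c) lives
# in column c + 1 (hence the +1 to skip the index column).
kernel_precomputed(x, i, j) = x[i, Int(x[j, 1]) + 1]   # == k(i, j)
```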
So, since Julia uses 1-based indexing, try passing `X` where `X[i][1] = i` and `X[i][j] = k(i, j - 1)` for `j > 1`? Like so:
```
1 k(x_1, x_1) k(x_1, x_2) k(x_1, x_3) k(x_1, x_4)
2 k(x_2, x_1) k(x_2, x_2) k(x_2, x_3) k(x_2, x_4)
3 k(x_3, x_1) k(x_3, x_2) k(x_3, x_3) k(x_3, x_4)
4 k(x_4, x_1) k(x_4, x_2) k(x_4, x_3) k(x_4, x_4)
```
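To make that layout concrete, here is a minimal sketch that prepends the index column to a Gram matrix (`augment_gram` is just an illustrative name, not part of LIBSVM.jl):

```julia
using LinearAlgebra

# Given an n×n Gram matrix G with G[i, j] = k(x_i, x_j), prepend the
# 1-based index column described above.
function augment_gram(G::AbstractMatrix)
    n = size(G, 1)
    @assert size(G, 2) == n "expected a square Gram matrix"
    return hcat(1:n, G)
end

X = randn(2, 4)                                        # four toy points, one per column
G = [dot(u, v) for u in eachcol(X), v in eachcol(X)]   # linear-kernel Gram matrix
K = augment_gram(G)                                    # 4×5; K[:, 1] == 1:4
```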
Could you verify that this is actually correct? If it is, we should definitely consider adding it to our documentation. We could also improve the API by accepting the Gram matrix and automatically prepending the required index column.
Edit: for prediction, the first column can contain arbitrary values; they will not be used by libsvm.
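Following the same reading of the C code, a prediction matrix for m test points might then be built like this (unverified, and again the helper name is made up):

```julia
# Gtest[i, j] = k(x_test_i, x_train_j) for m test rows and n training columns.
# Per the edit above, the first column is ignored at prediction time, so a
# placeholder (here zeros) should suffice.
function augment_test_gram(Gtest::AbstractMatrix)
    return hcat(zeros(size(Gtest, 1)), Gtest)
end
```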
@barucden Many thanks for this investigation!
@mrussek Is this something you can confirm for us?
I mean, check it directly with LIBSVM.jl. If it holds up, I'm sure we could make the MLJ interface a little more intuitive.
I'm trying to use `svmtrain` in Julia in that way, and it doesn't seem to work.
The following produces a kernel matrix like the one you described, on a synthetic set of data:

```julia
using LIBSVM
using LinearAlgebra

X = randn(100, 2)
y = X[:, 1] .> 0.3

# RBF Gram matrix with an extra first column holding the 1-based indices
K = zeros(100, 101)
K[:, 1] = 1:100
for i in 1:100, j in 1:i
    K[i, j+1] = exp(-norm(X[i, :] .- X[j, :])^2)
    K[j, i+1] = K[i, j+1]   # symmetric fill
end
```
but then this errors out:

```julia
model = svmtrain(K, y; kernel=Kernel.Precomputed)
```

with this error:

```
Size of second dimension of training instance matrix (101) does not match length of labels (100)
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] indices_and_weights(labels::BitVector, instances::Matrix{Float64}, weights::Nothing)
   @ LIBSVM ~/.julia/packages/LIBSVM/mQbZW/src/LIBSVM.jl:234
 [3] svmtrain(X::Matrix{Float64}, y::BitVector; svmtype::Type, kernel::LIBSVM.Kernel.KERNEL, degree::Int64, gamma::Float64, coef0::Float64, cost::Float64, nu::Float64, epsilon::Float64, tolerance::Float64, shrinking::Bool, probability::Bool, weights::Nothing, cachesize::Float64, verbose::Bool, nt::Int64)
   @ LIBSVM ~/.julia/packages/LIBSVM/mQbZW/src/LIBSVM.jl:344
 [4] top-level scope
   @ In[56]:14
```
This seems to indicate that the extra column is the problem. If we drop it (or even just transpose `K`, since the error shows LIBSVM.jl expects instances along the second dimension), training seems to work:

```julia
K = K[:, 2:end]
model = svmtrain(K, y; kernel=Kernel.Precomputed)
```
But inference doesn't seem to work on new examples:

```julia
svmpredict(model, randn(2, 10))
```

```
Model has 100 but 10 provided
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] svmpredict(model::LIBSVM.SVM{Bool}, X::Matrix{Float64}; nt::Int64)
   @ LIBSVM ~/.julia/packages/LIBSVM/mQbZW/src/LIBSVM.jl:383
 [3] svmpredict(model::LIBSVM.SVM{Bool}, X::Matrix{Float64})
   @ LIBSVM ~/.julia/packages/LIBSVM/mQbZW/src/LIBSVM.jl:378
 [4] top-level scope
   @ In[91]:1
```
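For the shapes to be consistent with a model trained on the 100-column matrix, prediction presumably needs kernel values between each test point and all 100 training points. Something like the sketch below at least passes the dimension check, though whether libsvm then interprets the values correctly is exactly what is in question here:

```julia
# Kernel values between the 100 training points (rows) and 10 test points
# (columns), matching the columns-as-instances convention from the error above.
Xtest = randn(10, 2)
Ktest = [exp(-norm(X[i, :] .- Xtest[j, :])^2) for i in 1:100, j in 1:10]
svmpredict(model, Ktest)   # dimensions now match; correctness unverified
```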
PS: just for clarity, I'm not actually trying to use an RBF kernel but something more sophisticated; I know libsvm has RBF support.
I've looked into it, and as far as I can tell LIBSVM.jl needs some adjustments to properly support precomputed kernels. I'll try to come up with something.
@ablaom @mrussek It is not as trivial as I originally thought. The main issue is with creating the `SVMNode`s from the input matrix. In the case of a precomputed kernel, our current method does not work as intended.
I think I am on the right track now but suggestions are welcome in #73. So far I have changed only training (not prediction).
@barucden Thank you!
In the original C code, one puts the kernel in a file passed as a command-line argument: https://github.com/cjlin1/libsvm/blob/eedbb147ea79af44f2ecdca1064f2c6a8fe6462d/README#L138
How does one do it for LIBSVM.jl?
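For reference, if the approach in #73 lands, I would expect the LIBSVM.jl usage to end up looking roughly like this (a sketch of the proposed interface, not the confirmed final API): pass the n×n Gram matrix to `svmtrain` and the n_train×n_test kernel matrix to `svmpredict`:

```julia
using LIBSVM

X = rand(10, 100)                     # 100 training points, one per column
y = rand(Bool, 100)
K = X' * X                            # 100×100 Gram matrix (linear kernel)
model = svmtrain(K, y; kernel=Kernel.Precomputed)

Xtest = rand(10, 5)
Ktest = X' * Xtest                    # 100×5 kernel values vs. training points
ŷ, _ = svmpredict(model, Ktest)
```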