This adds support for regression models. However, the models produced by liblinear do not seem to be very good.
For example, the following scikit-learn code:
from sklearn.svm import LinearSVR
import numpy as np
X = np.random.rand(10000, 1)
y = (2 * X)[:, 0]
m = LinearSVR(loss='squared_epsilon_insensitive', dual=False, verbose=1, fit_intercept=False)
m.fit(X, y)
print(m.coef_)
prints
iter 1 act 1.332e+04 pre 1.332e+04 delta 2.000e+00 f 1.332e+04 |g| 1.332e+04 CG 1
[LibLinear][1.99969976]
(meaning the linear coefficient was recovered accurately; the true value is 2)
Whereas the following Julia code
using LIBLINEAR
X = rand(1, 10000)
y = vec(2 .* X)
m = linear_train(y, X, solver_type=LIBLINEAR.L2R_L2LOSS_SVR, verbose=true)
println(m.w)
prints
init f 3.334e+11 |g| 4.989e+07
iter 1 f 1.462e+11 |g| 1.004e+03 CG 2 step_size 1.00e+00
[7503.24992010474]
with the current PR (meaning the linear coefficient is completely inaccurate; the true value is 2). As far as I can tell from my investigation, this is what liblinear itself returns. scikit-learn seems to use a different solver than liblinear, but I am not sure that is the only issue.
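One way to sanity-check the two results is to evaluate the training objective directly at each coefficient. The sketch below is only a rough check, not what either library runs internally: it assumes the usual L2-regularized, squared epsilon-insensitive SVR objective, 0.5*w^2 + C * sum(max(0, |w*x - y| - eps)^2), with C = 1 and eps = 0.1 (assumed values, not read from this PR), a single feature, and no intercept. The helper name l2_svr_objective is hypothetical.

```python
import random


def l2_svr_objective(w, xs, ys, C=1.0, eps=0.1):
    """Assumed objective: 0.5*w^2 + C * sum(max(0, |w*x - y| - eps)^2).

    Single feature, no intercept. C and eps are assumed values.
    """
    loss = sum(max(0.0, abs(w * x - y) - eps) ** 2 for x, y in zip(xs, ys))
    return 0.5 * w * w + C * loss


# Same kind of data as in the examples above: y = 2 * x with x uniform in [0, 1).
random.seed(0)
xs = [random.random() for _ in range(10000)]
ys = [2 * x for x in xs]

# Compare the objective at w = 0, at the true coefficient, and at the value
# printed by the Julia run above.
for w in (0.0, 2.0, 7503.24992010474):
    print(w, l2_svr_objective(w, xs, ys))
```

Under these assumptions, the loss term vanishes at w = 2, so only the regularization term remains there, while the objective at w ≈ 7503 is many orders of magnitude larger. If the assumed objective matches what the solver minimizes, a coefficient near 7503 cannot be a minimizer for this data.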
Also: linear_predict is not really type-stable, as the output type depends on solver_type. For one-class SVM, the output is a pair of Vector{String} and Vector{Float64}. For regression models, it is a pair of Vector{Float64} and Vector{Float64} (I made it return the same vector twice). For other models, it is a pair of Vector{typeof(labels)} and Vector{Float64}.
Fixes #37