Closed luboshanus closed 1 year ago
Just tried it with MLJ, but it uses the same setup.
using CSV, DataFrames
dataset = CSV.read(download("http://freakonometrics.free.fr/rent98_00.txt"), DataFrame)
x0 = [ones(size(dataset,1)) Matrix(dataset[!,[:area, :yearc]])]
y0 = dataset[!,:rent_euro]
df_mlj = DataFrames.DataFrame([y0 x0],[:rent_euro, :intc, :area, :yearc])
ym, Xm = unpack(df_mlj, ==(:rent_euro), in([:intc, :area, :yearc])) #; rng=123
using MLJ
QuantileRegressor = @load QuantileRegressor pkg=MLJLinearModels
mach = fit!(machine(QuantileRegressor(;delta=0.3, fit_intercept=false, penalty=:none), Xm, ym))
fitted_params(mach)
Fitted parameters:
julia> fitted_params(mach)
(coefs = [:intc => -0.000598562430985105, :area => 5.566200326965857, :yearc => 0.07844799525059157],
intercept = nothing,)
This might be no surprise, since these two MLJ models should be the same.
However, using QuantileRegressions, the result is the same as from the LP model and as the one from R:
# _ test with QuantileRegressions
using QuantileRegressions
ResultQR = qreg(@formula(rent_euro ~ area + yearc), df_mlj, .3)
output:
julia> ResultQR = qreg(@formula(rent_euro ~ area + yearc), df_mlj, .3)
StatsModels.TableRegressionModel{QuantileRegressions.QRegModel, Matrix{Float64}}
rent_euro ~ 1 + area + yearc
Coefficients:
─────────────────────────────────────────────────────────
Quantile Estimate Std.Error t value
─────────────────────────────────────────────────────────
(Intercept) 0.3 -5542.5 169.111 -32.7743
area 0.3 3.97813 0.0838414 47.4483
yearc 0.3 2.88723 0.085967 33.5854
─────────────────────────────────────────────────────────
Thus, could you help with MLJ quantile regression estimation? Thank you.
Hello!
Thanks for the report. I investigated this a little bit and a core part of the problem seems to be around tau. I'm running a bit out of time to fully complete this, so I'm just writing notes quickly here and will think of a way to fix this later. It seems tau is on the "wrong side", i.e. setting 0.3 actually amounts to 0.7 and vice versa.
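For reference, here is a minimal sketch of the pinball (quantile) loss that the regression minimises per residual; it makes the "wrong side" symmetry concrete, since evaluating with tau and 1 - tau simply exchanges the weights on positive and negative residuals:

```julia
# Pinball (quantile) loss for a single residual r at level tau:
#   rho_tau(r) = tau * r        if r >= 0
#              = (tau - 1) * r  if r < 0
pinball(r, tau) = r >= 0 ? tau * r : (tau - 1) * r

# With tau = 0.3, under-predictions (positive residuals) get weight 0.3
# and over-predictions get weight 0.7; swapping tau for 1 - tau exchanges
# the weights, which is exactly the swap described above.
pinball(1.0, 0.3)   # 0.3
pinball(-1.0, 0.3)  # 0.7
pinball(1.0, 0.7)   # 0.7, same as pinball(-1.0, 0.3)
```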
To debug this stuff (explaining here as it might be of interest to you), a good thing is to check the objective
function associated with the regression object:
y = Vector(dataset[!,:rent_euro])
X = Matrix(dataset[!,[:area, :yearc]])
qr = QuantileRegression(0.3; penalty=:none)
qr2 = QuantileRegression(0.7; penalty=:none)
obj_qr = objective(qr, X, y)
obj_qr2 = objective(qr2, X, y)
mlj_beta = [
5.566200326965212,
0.07844799525061612,
-0.0005985624309847961
]
mlj_beta2 = fit(qr2, X, y)
obj_qr(mlj_beta) # 381_972
obj_qr(mlj_beta2) # 226_639
obj_qr2(mlj_beta) # 245_533
obj_qr2(mlj_beta2) # 380_221
This shows that mlj_beta2 is a better fit for obj_qr than mlj_beta, and vice versa (which doesn't make sense, of course). So there's a swap somewhere which needs to be fixed.
As a note, if you compute the same objective functions with the coefs obtained via R you'd get
obj_qr(r_beta) # 207_551
obj_qr2(r_beta) # 354_518
which makes sense. Note that the objective obtained with MLJLinearModels is still about 10% worse than what you get with R. This may decrease further once I've fixed the symmetry thing. I'm somewhat less worried about 10%, though, as the solver used is completely different (at least less worried than the 84% worse we currently get...), though I'd prefer this to be just a few percent (we might be able to get there with standardisation, solver parameter tuning, etc.).
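The "about 10% worse" figure follows directly from the objective values reported above:

```julia
# Relative gap between the MLJLinearModels objective and the R (quantreg)
# objective at tau = 0.3, using the values reported above:
obj_mlj = 226_639   # obj_qr(mlj_beta2)
obj_r   = 207_551   # obj_qr(r_beta)
gap = (obj_mlj - obj_r) / obj_r   # ~0.092, i.e. roughly 10% worse
```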
Thanks for your time looking at it. I understand that one should look at the objective; still, the data are quite trivial to solve, and MLJ's QuantileRegression should provide the same coefficient estimates even for non-standardised data, right?
I have switched the tau and provide the output here. The coefficients are still at a different scale. I don't know exactly what is happening and where in the routine. Nonetheless, the objective function is one thing; the benchmark data/output is the other.
Hope you will find it easily in the solver/setup of the quantile regression.
tau2 = 0.7
mlj_qreg2 = MLJLinearModels.QuantileRegression(tau2; fit_intercept=true, scale_penalty_with_samples=false, penalty=:none)
mlj_beta2 = MLJLinearModels.fit(mlj_qreg2, Matrix(dataset[!,[:area, :yearc]]), dataset[!,:rent_euro])
Coefs:
julia> mlj_beta2 = MLJLinearModels.fit(mlj_qreg2, Matrix(dataset[!,[:area, :yearc]]), dataset[!,:rent_euro])
3-element Vector{Float64}:
3.2583091596534843
0.07749778937937656
-0.0004959079331218055
To be clear (maybe I wasn't in my previous message): what I reported does show that there is an issue in the package which should be fixed (the tau stuff). After fixing it, you may still get pretty different results, and this is not necessarily a bug (especially if the objective function at the two points is roughly the same).
The point about standardising etc. has to do with how similar you can expect the objective function values to be after interrupting two different solvers run with their own stopping rules. If you have a nice convex function with two perfectly tuned solvers, they will/should converge to the same spot; what I'm saying is that this almost certainly isn't the case here (even though the function is obviously convex). My point is just that even if everything is set properly with good defaults, you may end up stopping in an area which is not a minimum but seems like one to the solver (e.g. a really flat valley). The parameters at that point may be completely different (including very different magnitude) from the ones you'd get with some other method (e.g. LP here). You get this all the time with neural nets, for instance (though there the objective function is of course non-convex). Standardising helps make the objective function "nicer" and can reduce the gap between close-to-optimal points.
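As a sketch of the standardisation idea, one could rescale the covariates before fitting and map the coefficients back afterwards. This assumes, as in the fits above, that fit returns the intercept as the last coefficient; it is an illustration, not the package's built-in preprocessing:

```julia
# Hypothetical sketch: standardise the covariates so the objective is
# better conditioned, then map the coefficients back to the original scale.
using Statistics

mu = mean(X, dims=1)
sigma = std(X, dims=1)
Xs = (X .- mu) ./ sigma

qr = QuantileRegression(0.3; penalty=:none)   # fit_intercept=true by default
beta_s = fit(qr, Xs, y)                       # coefficients on the standardised scale

# Undo the scaling: beta_j = beta_s_j / sigma_j, then adjust the intercept,
# since yhat = Xs * bs .+ b0 = X * (bs ./ sigma) .+ (b0 - sum(mu .* bs ./ sigma)).
beta = vec(beta_s[1:end-1]) ./ vec(sigma)
intercept = beta_s[end] - sum(beta .* vec(mu))
```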
Thanks for your comment. I do understand what you write about optimization. I wish this package and algorithm would converge to results close to the LP one, since the computation here is much faster than JuMP's LP. Maybe that is the point of the different results. I look forward to seeing how the optimization behaves when the thing with tau is changed.
Thanks anyway!
Ok so the issue with tau is fixed in #148 which is good. I'll merge and do a patch release.
For the comparison of QR, here are a couple of notes:
The objective here is worse than with the fit_br and fit_fnp methods but better than fit_lasso. This just confirms the discussion we had earlier: in this instance, it looks like the method stops in a non-optimal zone. In fact, passing solver=LBFGS(optim_options=Optim.Options(show_trace=true)) shows that the iterates get stuck. Scaling the covariates by n (which is probably a good idea anyway) gets you closer (~3.8% worse compared to IP).
What could be interesting is to check how well the fit from either method generalises (e.g. if you split the data 80/20, train on the 80% and predict on the 20%, how well does it do?).
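The suggested 80/20 check could be sketched as below, scoring the held-out predictions with the pinball loss at tau = 0.3. The variable names are illustrative, and the intercept is again assumed to be the last coefficient:

```julia
# Sketch of an 80/20 generalisation check for the quantile fit.
using Random, Statistics

pinball(r, tau) = r >= 0 ? tau * r : (tau - 1) * r

n = size(X, 1)
idx = shuffle(1:n)
ntrain = round(Int, 0.8n)
train, test = idx[1:ntrain], idx[ntrain+1:end]

qr = QuantileRegression(0.3; penalty=:none)
beta = fit(qr, X[train, :], y[train])

# Predict on the held-out 20% (intercept last, as in the fits above)
# and score with the average pinball loss at the same tau.
yhat = X[test, :] * beta[1:end-1] .+ beta[end]
score = mean(pinball.(y[test] .- yhat, 0.3))
```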
I hope this clarifies what's going on and is a bit helpful! I'll close this now as I don't think there's anything left to fix in the package but feel free to reopen if you still have questions.
Just want to add that I appreciate the time spent by @luboshanus and @tlienart on this issue.
Hi,
I was a bit forced to implement/rewrite quantile regression as a linear programming (LP) model, because the results obtained using this package were not the same as (or similar to) those from MATLAB code I had from a paper. MATLAB used LP, and at first I did not want to code LP in Julia, so I took MLJLinearModels with its possibility of an L1 penalty for quantile regression. Nonetheless, the different results pushed me to rewrite quantile regression as LP, which I also wanted to compare with MLJLinearModels' simple NoPenalty QuantileRegression.
Here is an example from stackoverflow using R and an example using MLJLinearModels: https://stats.stackexchange.com/questions/384909/formulating-quantile-regression-as-linear-programming-problem
This is simple quantile regression on these data, not the LP model. In R using quantreg:
In Julia:
It might help with an idea of tests or a comparison with quantreg. Could you please verify or help with what the problem with the results is? The coefficients are at a different scale, and not only that. :) Thanks