azev77 opened 4 years ago
Also, a generic learner is a great pedagogical tool because it compactly unifies many different ML models. I'd love to co-author a tutorial that gives a non-black-box intro to ML.
First consider constant models: ŷ = f(x; θ_f) = θ_f
(1.a) Mean: (L(y,ŷ) = (y-ŷ)^2, f(x) = θ_f, P(f) = 0)
(1.b) Median: (L(y,ŷ) = |y-ŷ|, f(x) = θ_f, P(f) = 0)
(1.c) Quantile: (L(y,ŷ; θ_L) = Check(y,ŷ; θ_L), f(x) = θ_f, P(f) = 0), where θ_L is the quantile level
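To make (1.a)-(1.c) concrete, here is a minimal sketch (plain Optim.jl, not an MLJ API; `fit_const` and `check` are illustrative names) that recovers each constant model by numerically minimizing its loss:

```julia
# Minimal sketch: each constant model above is just arg min over θ of the summed loss.
using Optim, Statistics

check(u, τ) = u >= 0 ? τ * u : (τ - 1) * u      # check (pinball) loss, τ = θ_L

fit_const(L, y) =
    Optim.minimizer(optimize(θ -> sum(L(yi - θ) for yi in y),
                             minimum(y), maximum(y)))

y = randn(1000)
fit_const(u -> u^2, y)               # (1.a) ≈ mean(y)
fit_const(abs, y)                    # (1.b) ≈ median(y)
fit_const(u -> check(u, 0.9), y)     # (1.c) ≈ quantile(y, 0.9)
```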
Next consider linear (in the params) models: ŷ = f(x; θ_f) = θ_f*x
(2.a) OLS: (L(y,ŷ) = (y-ŷ)^2, f(x) = θ_f*x, P(f) = 0)
(2.b) Lasso: (L(y,ŷ) = (y-ŷ)^2, f(x) = θ_f*x, P(f) = L1())
(2.c) Ridge: (L(y,ŷ) = (y-ŷ)^2, f(x) = θ_f*x, P(f) = L2())
(2.d) LAD: (L(y,ŷ) = |y-ŷ|, f(x) = θ_f*x, P(f) = 0)
(2.e) Quantile Reg: (L(y,ŷ; θ_L) = Check(y,ŷ; θ_L), f(x) = θ_f*x, P(f) = 0)
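Similarly, a sketch (again plain Optim.jl; `fit_lin` is an illustrative name, and Nelder-Mead is a crude choice for the non-smooth penalties) showing that (2.a)-(2.d) differ only in the (L, P) pair passed to one generic empirical-risk minimizer:

```julia
# One generic empirical-risk minimizer covers the whole linear family; only (L, P) change.
using Optim, LinearAlgebra

function fit_lin(L, P, X, y; λ = 0.1)
    obj(θ) = sum(L(y[i] - dot(X[i, :], θ)) for i in axes(X, 1)) + λ * P(θ)
    Optim.minimizer(optimize(obj, zeros(size(X, 2)), NelderMead()))
end

X, y = randn(100, 3), randn(100)
fit_lin(u -> u^2, θ -> 0.0,          X, y)   # (2.a) OLS, ≈ X \ y
fit_lin(u -> u^2, θ -> sum(abs, θ),  X, y)   # (2.b) Lasso
fit_lin(u -> u^2, θ -> sum(abs2, θ), X, y)   # (2.c) Ridge
fit_lin(abs,      θ -> 0.0,          X, y)   # (2.d) LAD
```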
Some models allow infinite-dimensional hyper-parameters:
Eg 1: XGBoost allows custom loss functions (as does @xiaodaigh's JLBoost.jl).
Eg 2: @joshday's SparseRegression.jl allows a custom loss & a custom penalty.
Eg 3: @rakeshvar's AnyBoost.jl allows custom Loss/Activation/Constraint.
An oft-advertised feature of "doing ML in Julia" is how easy it is to customize traditional models: (Julia Computing) & (Discourse).
Is it possible to make it easy to identify which models in MLJ allow infinite-dimensional HPs (custom Loss/Penalty/CEF)? For example:

```julia
models(x -> x.is_supervised && x.is_pure_julia && x.custom_penalty)
```
lists all supervised models written in pure Julia which allow custom penalties. It would be awesome to automate tuning of infinite-dimensional HPs. The Julia Computing example adds a weight parameter `w` to the logistic loss; it would be cool to make it easy to tune `w` over a grid in MLJ.
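As a sketch of what that could look like with MLJ's existing tuning machinery (the model `WeightedLogistic` and its field `w` are hypothetical; `range`, `Grid`, `CV`, and `TunedModel` are real MLJ API):

```julia
# Grid-tune a loss hyper-parameter exactly like any other hyper-parameter.
using MLJ

mdl = WeightedLogistic()          # hypothetical model exposing the loss weight w
r   = range(mdl, :w, lower = 0.1, upper = 10.0, scale = :log)
tm  = TunedModel(model = mdl, range = r,
                 tuning = Grid(resolution = 10),
                 resampling = CV(nfolds = 5),
                 measure = log_loss)
# fit!(machine(tm, X, y))         # X, y = your data
```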
At some point, it would be magical to include a model in MLJ called `GenericLearner` w/ 3 infinite-dimensional inputs (L(y,ŷ), f(x), P(f)), where:
- L(y, ŷ; θ_L): the Loss function; in the above example θ_L = w
- ŷ = f(x; θ_f): the Model; in SparseRegression.jl it is always f(x; θ_f) = θ_f*x
- P(f; θ_P): the Penalty, L1/MCP/whatever suits your fancy, as long as the optimization problem is nice
Then the user chooses her fav algorithm to solve:

ŷ = arginf Q() := L(y, ŷ; θ_L) + P(f)
ŷ = arginf Q() := L(y, f(x; θ_f); θ_L) + P(θ_f; θ_P)

I realize this may be insanely hard to solve unless the user makes judicious choices of (L, f, P). A good student will see that this is "just" a constrained optimization problem w/ objective L(y,ŷ) and constraint P(f).
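A minimal sketch of what such a `GenericLearner` could look like (hypothetical, not an existing MLJ model; the struct and `fit` are mine, with Optim.jl doing the inner solve):

```julia
# Package (L, f, P) into one learner and hand the objective Q to an optimizer.
using Optim

struct GenericLearner{TL,TF,TP}
    L::TL   # loss     L(y, ŷ)
    f::TF   # model    f(x, θ_f)
    P::TP   # penalty  P(θ_f)
end

function fit(gl::GenericLearner, X, y, θ0)
    Q(θ) = sum(gl.L(y[i], gl.f(X[i, :], θ)) for i in eachindex(y)) + gl.P(θ)
    Optim.minimizer(optimize(Q, θ0, NelderMead()))
end

# Lasso as one special case of (L, f, P):
gl = GenericLearner((y, ŷ) -> (y - ŷ)^2, (x, θ) -> x' * θ, θ -> 0.1 * sum(abs, θ))
# θ̂ = fit(gl, X, y, zeros(size(X, 2)))
```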
Some joke that this is "All of ML in one expression". It's not, but it is a powerful abstraction.

UPDATE: EmpiricalRisks.jl by @lindahua does exactly this!

@rakeshvar put it nicely when discussing JuML.jl: