Setting domains for hyperparameters

tlienart / AnalyticalEngine.jl

[draft] Agnostic Machine Learning models working on CPUs, GPUs, distributed architecture, etc.


@fkiraly hinted at this and I think it's a nice idea: basically, have "acceptable domains" for hyperparameters.

For example, we wouldn't want a penalty to have a negative scale, so that parameter would carry a non-negativity constraint.

I'm not quite sure yet how we could make this appear nicely in the API. Basically, for any model you'd have to

- specify the hyperparameters (see the hyperparameters function in GLR, which returns the symbols of the fields that can be mutated), and
- for each hyperparameter, store some kind of condition to check validity.

One way could be to have the hyperparameters function return a dictionary instead of a tuple of symbols, where the values of the dictionary are conditions such as

val -> 0 <= val < Inf
val -> val ∈ [val1, val2, val3, ...]

and then modify the set! function (see https://github.com/tlienart/AnalyticalEngine.jl/blob/master/src/supervised/sm-utils.jl) with something like

function set!(model::SupervisedModel; kwargs...)
    # retrieve the hyperparameter names and their validity conditions
    hp = hyperparameters(model)
    hp_names = keys(hp)
    for (symbol, value) ∈ kwargs
        @assert symbol ∈ hp_names "Unrecognised hyperparameter $symbol"
        # each condition is a predicate returning true for admissible values
        @assert hp[symbol](value) "Given parameter $symbol out of range"
        # assign the field directly instead of going through eval
        setproperty!(model, symbol, value)
    end
    return model
end
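
To make this concrete, here is a minimal self-contained sketch of the idea. The `ToyRidge` type and its fields are made up for illustration and are not part of AnalyticalEngine.jl; it also throws errors rather than using `@assert`, which is just one possible choice.

```julia
# Hypothetical stand-ins, not the package's actual types.
abstract type SupervisedModel end

mutable struct ToyRidge <: SupervisedModel
    lambda::Float64   # penalty scale, must be non-negative
    solver::Symbol    # must be one of a fixed set of choices
end

# Map each mutable hyperparameter to a predicate describing its domain.
hyperparameters(::ToyRidge) = Dict{Symbol,Function}(
    :lambda => (val -> 0 <= val < Inf),
    :solver => (val -> val ∈ (:cholesky, :cg)),
)

function set!(model::SupervisedModel; kwargs...)
    hp = hyperparameters(model)
    for (symbol, value) in kwargs
        haskey(hp, symbol) ||
            throw(ArgumentError("Unrecognised hyperparameter $symbol"))
        hp[symbol](value) ||
            throw(DomainError(value, "Given parameter $symbol out of range"))
        # setproperty! converts the value to the declared field type
        setproperty!(model, symbol, value)
    end
    return model
end

model = ToyRidge(1.0, :cholesky)
set!(model; lambda = 0.5)   # accepted, lies in the domain
```

A call such as `set!(model; lambda = -1.0)` would be rejected by the `:lambda` predicate and leave the model unchanged.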

mlr solves this with a dictionary-type construction that has fixed fields, say for type (continuous?), upper and lower bounds, etc.
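
A rough Julia sketch of what such a fixed-field descriptor could look like is given here. The names `ParamSpec`, `continuous`, `discrete` and `isadmissible` are hypothetical, and this is only loosely in the spirit of the mlr construction, not a port of its API.

```julia
# Hypothetical descriptor with fixed fields (not mlr's actual API).
struct ParamSpec
    kind::Symbol          # :continuous, :integer or :discrete
    lower::Float64        # inclusive lower bound (unused for :discrete)
    upper::Float64        # inclusive upper bound (unused for :discrete)
    choices::Vector{Any}  # admissible values for :discrete parameters
end

# Convenience constructors for the two most common cases.
continuous(lower, upper) = ParamSpec(:continuous, lower, upper, Any[])
discrete(choices...) = ParamSpec(:discrete, -Inf, Inf, Any[choices...])

# Check a candidate value against the descriptor.
function isadmissible(spec::ParamSpec, val)
    if spec.kind == :discrete
        return val ∈ spec.choices
    elseif spec.kind == :integer
        return val isa Integer && spec.lower <= val <= spec.upper
    else
        # require a finite real number within the bounds
        return val isa Real && isfinite(val) && spec.lower <= val <= spec.upper
    end
end

lambda_spec = continuous(0, Inf)
penalty_spec = discrete(:l1, :l2)
```

Compared with bare predicates, a descriptor like this can be inspected by a tuner (e.g. to read off the bounds), at the cost of being less flexible than an arbitrary function.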

One additional question this raises is whether one should distinguish between:

- Recommended ranges, e.g., defaults for a single call, grids for grid-tuning, or ranges for likelihood-based tuning, and
- Exclusion ranges, e.g., sets of values among which the parameter must be taken (otherwise the method breaks).

One issue is that an exclusion range is attached to the method itself, whereas recommended ranges (except the single-call default) relate to meta-methodology that is only potentially applied. From an API perspective it is therefore slightly odd to keep them with the original learner, though not entirely unjustifiable, since grid-tuning is fairly standard and “sensible defaults” is an important API design principle.
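
One way to picture this separation, purely as a sketch: the learner owns its hard domain, while a tuning-side object owns the recommended grid and is checked against that domain. All names here (`ToyLasso`, `domain`, `GridSuggestion`, `iscompatible`) are hypothetical.

```julia
# Hypothetical names throughout; this only illustrates the separation.
struct ToyLasso
    lambda::Float64
end

# Exclusion range: attached to the method, violating it breaks the learner.
domain(::Type{ToyLasso}) = Dict(:lambda => (val -> 0 <= val < Inf))

# Recommended range: owned by the tuning layer, not by the learner.
struct GridSuggestion
    grids::Dict{Symbol,Vector{Float64}}
end

# A suggestion is usable only if every candidate lies inside the hard domain.
function iscompatible(T::Type, s::GridSuggestion)
    dom = domain(T)
    return all(haskey(dom, name) && all(dom[name], grid)
               for (name, grid) in s.grids)
end

suggestion = GridSuggestion(Dict(:lambda => [0.01, 0.1, 1.0, 10.0]))
iscompatible(ToyLasso, suggestion)   # every candidate is non-negative and finite
```

With this split, the learner never needs to know about grids, but a tuner can still validate its candidates before fitting anything.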

On the other hand, parameters can be “set” by a variety of tuning methods, so I wonder what the right language for selecting a “solver”/“tuner” would be. I still don’t think that putting this as the parameter of the type is a good idea, but I’m not sure whether or how to set this as a generic interface point.

Setting domains for hyperparameters #11