JuliaAI / MLJTuning.jl

Hyperparameter optimization algorithms for use in the MLJ machine learning framework
MIT License

Add `history_additions` to resolve #202 at least for grid searches #205

Closed dpaetzel closed 4 months ago

dpaetzel commented 5 months ago

This adds the `history_additions` option to `TunedModel`, which allows one to specify a function `f` that is then called on each set of folds during tuning as `f(model, fitted_params_per_fold)`, where the return value is stored in the corresponding history entry under `history_additions`.

This closes #202 to some extent: when using searches that do not adjust their search space based on their trajectory (e.g. `Grid` and `LatinHypercube`), it makes it possible to optimize with respect to functions that are not exclusively based on predictive performance. For example:

```julia
using Statistics: mean  # `mean` is used by the selection heuristic below
using MLJTuning
using MLJ
DTRegressor = @load DecisionTreeRegressor pkg = DecisionTree verbosity = 0
using DecisionTree: DecisionTree

N = 800
X, y = rand(N, 5), rand(N)
X = MLJ.table(X)

model = DTRegressor()

space = [
    range(model, :max_depth; lower=1, upper=5),
    range(
        model,
        :min_samples_split;
        lower=ceil(0.001 * N),
        upper=ceil(0.05 * N),
    ),
]

function histadds(model, fitted_params_per_fold)
    return DecisionTree.depth.(getproperty.(fitted_params_per_fold, :raw_tree))
end

struct HistoryAdditionsSelection <: MLJTuning.SelectionHeuristic end

function MLJTuning.best(::HistoryAdditionsSelection, history)
    # Compute the mean of the depths stored in `history_additions`.
    scores = mean.(getproperty.(history, :history_additions))
    # Within this contrived example, the best hyperparametrization is the one
    # resulting in the least mean depth.
    index_best = argmin(scores)
    return history[index_best]
end

# Let's pirate some types. Julia, please forgive me.
function MLJTuning.supports_heuristic(
    ::LatinHypercube,
    ::HistoryAdditionsSelection,
)
    return true
end

modelt = TunedModel(;
    model=model,
    resampling=CV(; nfolds=3),
    tuning=LatinHypercube(; gens=30),
    range=space,
    measure=mae,
    n=10,
    history_additions=histadds,
    selection_heuristic=HistoryAdditionsSelection(),
)

macht = machine(modelt, X, y)
MLJ.fit!(macht; verbosity=1000)
display(getproperty.(report(macht).history, :history_additions))
```

ablaom commented 5 months ago

Thanks for this. The sample application is very helpful. This can probably be adapted for documentation.

I think we are being too specific with the signature and are unnecessarily ruling out some other possible use cases. Rather than `history_additions(model, fitted_params_per_fold)`, can we widen the signature to `history_additions(model, E)`, where `E` is the `PerformanceEvaluation` object returned by `evaluate`?
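To make that concrete, the depth-extracting function from the example above might then look something like this (just a sketch under the proposed signature; it assumes the per-fold fitted parameters stay accessible via `E.fitted_params_per_fold`, as for the `PerformanceEvaluation` objects returned by `evaluate`):

```julia
# Sketch of a history_additions function under the proposed, widened signature.
# `E` would be the full PerformanceEvaluation, so the per-fold fitted
# parameters, measurements, etc. all remain accessible.
function histadds(model, E)
    return DecisionTree.depth.(getproperty.(E.fitted_params_per_fold, :raw_tree))
end
```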

Otherwise, this looks good to me.

codecov[bot] commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 87.40%. Comparing base (e3293dc) to head (5f4b83d). Report is 17 commits behind head on dev.

:exclamation: Current head 5f4b83d differs from pull request most recent head e2cbcd6. Consider uploading reports for the commit e2cbcd6 to get more accurate results

Additional details and impacted files

```diff
@@            Coverage Diff            @@
##              dev     #205      +/-  ##
=========================================
+ Coverage   86.44%   87.40%   +0.96%
=========================================
  Files          13       13
  Lines         649      659      +10
=========================================
+ Hits          561      576      +15
+ Misses         88       83       -5
```


ablaom commented 5 months ago

Or, we could just insist that a history entry (the two options are sketched below):

  1. includes every property that PerformanceEvaluation objects include, or
  2. (breaking) includes a single property `evaluation` with value the `PerformanceEvaluation` object.
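
For illustration only (this is a sketch of how an entry might be accessed under each option, not final API):

```julia
# Sketch: accessing a tuning history entry under each of the two options.
entry = first(report(macht).history)

# Option 1: the PerformanceEvaluation properties are copied onto the entry itself:
entry.fitted_params_per_fold

# Option 2 (breaking): the entry carries the evaluation object as a single property:
entry.evaluation.fitted_params_per_fold
```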

Thoughts, @dpaetzel?

dpaetzel commented 4 months ago
> 1. includes every property that `PerformanceEvaluation` objects include

As in copy all properties of E = evaluate(resampling_machine) into the history entry (as opposed to only E.measure, E.measurement and E.per_fold)?

> 2. (breaking) includes a single property `evaluation` with value the `PerformanceEvaluation` object.

Why would this be breaking? This seems to me to be no more breaking than the first proposal (where we add more than just a single property called evaluation)?

dpaetzel commented 4 months ago

(I'm with you on allowing more than just my use case.)

ablaom commented 4 months ago

> As in copy all properties of E = evaluate(resampling_machine) into the history entry (as opposed to only E.measure, E.measurement and E.per_fold)?

yes.

> Why would this be breaking? This seems to me to be no more breaking than the first proposal (where we add more than just a single property called evaluation)?

Well, breaking if we remove the existing properties that will now become redundant. How about we go with this option but leave the redundant properties, and I'll add an issue to remove them in the next breaking release?
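
Concretely, a history entry would then look roughly like this (a sketch of the intent, not the merged code):

```julia
# Sketch: an entry keeps the existing flat properties and, in addition,
# exposes the full evaluation object.
entry = first(report(macht).history)

entry.measurement                        # existing property, kept for now
entry.per_fold                           # existing property, kept for now
entry.evaluation.fitted_params_per_fold  # new: via the full PerformanceEvaluation
```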

dpaetzel commented 4 months ago

> Well, breaking if we remove the existing properties that will now become redundant.

Got it.

> How about we go with this option but leave the redundant properties, and I'll add an issue to remove them in the next breaking release?

I think this is a sensible way to move forward. :+1: I'll update the PR in the next few days (sorry for the delay!).

dpaetzel commented 4 months ago

I undid the many small changes and now only add the `PerformanceEvaluation` object to the history entry, as a field named `evaluation`. Updated usage example:

```julia
import MLJBase: recursive_getproperty
using Statistics: mean  # `mean` is used by the selection heuristic below
using MLJTuning
using MLJ
DTRegressor = @load DecisionTreeRegressor pkg = DecisionTree verbosity = 0
using DecisionTree: DecisionTree

N = 300
X, y = rand(N, 3), rand(N)
X = MLJ.table(X)

model = DTRegressor()

space = [
    range(model, :max_depth; lower = 1, upper = 5),
    range(model, :min_samples_split; lower = ceil(0.001 * N), upper = ceil(0.05 * N)),
]

struct TreeDepthSelection <: MLJTuning.SelectionHeuristic end

function MLJTuning.best(::TreeDepthSelection, history)
    # Extract the depths of all folds of all history entries.
    fparams = recursive_getproperty.(history, Ref(:(evaluation.fitted_params_per_fold)))
    depths = [DecisionTree.depth.(getproperty.(fparam, :raw_tree)) for fparam in fparams]

    # Compute the mean of the per-fold depths for each history entry.
    scores = mean.(depths)
    # Within this contrived example, the best hyperparametrization is the one
    # resulting in the least mean depth.
    index_best = argmin(scores)
    return history[index_best]
end

function MLJTuning.supports_heuristic(::LatinHypercube, ::TreeDepthSelection)
    return true
end

modelt = TunedModel(;
    model = model,
    resampling = CV(; nfolds = 3),
    tuning = LatinHypercube(; gens = 30),
    range = space,
    measure = mae,
    selection_heuristic = TreeDepthSelection(),
    n = 5,
)

macht = machine(modelt, X, y)
MLJ.fit!(macht; verbosity = 1000)

display(report(macht).history[1].evaluation)
```

ablaom commented 4 months ago

Thanks @dpaetzel for the help with this. I've fixed the invalidated test in a new PR.

Closing in favour of #210.