JuliaAI / IterationControl.jl

A package for controlling iterative algorithms

Issues on using IterationControl.jl for iterative solvers #41

Closed. liuyxpp closed this issue 3 years ago.

liuyxpp commented 3 years ago

I am trying to use IterationControl.jl for my iterative solvers. Unlike machine learning models, there is no distinction between in-sample loss and out-of-sample loss: we have only one kind of loss, the residual of the iterative solver. My current approach is to use a combination of Step(1) and other info and stopping controls. My questions are:

  1. What is Step(n) with n > 1 for iterative solvers? Does it bring any new advantages?
  2. Is it useful for an iterative solver to check the loss only every n > 1 iterations, rather than at each iteration, to avoid noisy stopping signals?
  3. The PQ stopping control is very interesting, but it is not clear how to use it for iterative solvers. What is training_losses, and how should it be defined?
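For context, my current setup looks roughly like this (a sketch only; `MySolver`, `step!` and `residual` are placeholders for my own code, not part of the package):

```julia
using IterationControl

# Placeholder solver: `step!` performs one iteration and
# `residual` returns the only loss we have.
mutable struct MySolver
    state::Vector{Float64}
end
step!(s::MySolver) = (s.state .*= 0.5)        # stand-in iteration
residual(s::MySolver) = sum(abs, s.state)     # stand-in residual

# Lift the solver to the IterationControl interface:
IterationControl.train!(s::MySolver, n) = (for _ in 1:n; step!(s); end)
IterationControl.loss(s::MySolver) = residual(s)

solver = MySolver([1.0, 2.0])
IterationControl.train!(solver,
    Step(1),                  # one iteration per control cycle
    Threshold(1e-8),          # stop when the residual is small enough
    NumberLimit(10_000),      # safety cap on iterations
    Info(s -> residual(s)))   # log the residual
```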

BTW, I cannot find an example of how to define training_losses for machine learning models in the examples folder either.

ablaom commented 3 years ago

Thanks for your query.

  1. As in the readme example, one assumes there is a notion of iterating your model for "n more iterations". If you implement IterationControl.train!(model, n), then calling IterationControl.train!(model, 4), for example, instigates "4 more iterations" of training. If you instead call IterationControl.train!(model, Step(4), ...), then you ensure all other controls are applied only once every 4 iterations. In other words, calling Step(4) is equivalent to calling Step(1) and wrapping each of the other controls in IterationControl.skip(_, predicate=4). The advantage is mainly one of convenience, although you may also avoid some overhead from function calls and from stopping and restarting iteration of your model (eg, moving data on and off a GPU).
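To illustrate the equivalence (a sketch; `Rooter` is a minimal readme-style model of my own naming, lifted to the interface):

```julia
using IterationControl

# Minimal lifted model: Babylonian square-root iteration.
mutable struct Rooter
    x::Float64
    root::Float64
end
IterationControl.train!(m::Rooter, n) =
    (for _ in 1:n; m.root = (m.root + m.x/m.root)/2; end)
IterationControl.loss(m::Rooter) = abs(m.root^2 - m.x)

# These two calls apply `NumberLimit` at the same effective frequency:
IterationControl.train!(Rooter(9.0, 1.0), Step(4), NumberLimit(5))
IterationControl.train!(Rooter(9.0, 1.0), Step(1),
    IterationControl.skip(NumberLimit(5), predicate=4))
```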

  2. I don't know if this is always a good way to avoid the noise. It might be in your use case. It may also be more effective to adjust the parameters of one of the standard stopping criteria (eg, increase n in Patience(n)).
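For example, with a lifted model as above (again a sketch; `Rooter` is my own stand-in), a larger n in Patience(n) tolerates longer runs of consecutive loss increases before stopping, which dampens noise without skipping loss checks:

```julia
using IterationControl

mutable struct Rooter
    x::Float64
    root::Float64
end
IterationControl.train!(m::Rooter, n) =
    (for _ in 1:n; m.root = (m.root + m.x/m.root)/2; end)
IterationControl.loss(m::Rooter) = abs(m.root^2 - m.x)

# Stop only after 5 consecutive increases in the loss, with a hard cap:
IterationControl.train!(Rooter(9.0, 1.0),
    Step(1), Patience(5), NumberLimit(10_000))
```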

  3. If you have only one kind of "loss", then I expect there's no point in implementing IterationControl.training_losses, and PQ is probably not useful to you. The way PQ works is that it modifies GL (the generalization loss) according to how far training has "progressed". In this way, we don't wait for too great an increase in the generalization loss (which typically turns north for over-fitting machine learning models like deep neural nets) when we know the learned parameters are hardly changing any more. This "progress" is defined using the training loss, which is typically the objective function the iterative model is minimising. When the training loss has plateaued, the "progress" is low. I think PQ makes the most sense in the context where you are getting training losses every iteration for free (ie, as a byproduct of training) but computing out-of-sample losses is an extra cost. For a more authoritative description, see the Prechelt paper.
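Concretely, Prechelt's quantities can be sketched in plain Julia (the names here are mine, not the package's API, and the package's implementation may differ in detail):

```julia
# Generalization loss: percentage increase of the out-of-sample loss
# over the lowest value seen so far (E_opt).
generalization_loss(E_val, E_opt) = 100 * (E_val / E_opt - 1)

# Training "progress" over the last k training losses: large while the
# training loss is still falling, near zero once it has plateaued.
function progress(training_losses, k)
    strip = training_losses[end-k+1:end]
    return 1000 * (sum(strip) / (k * minimum(strip)) - 1)
end

# PQ quotient GL/P: a given rise in generalization loss triggers
# stopping sooner when progress is low (plateau) than when it is high.
pq_quotient(E_val, E_opt, training_losses, k) =
    generalization_loss(E_val, E_opt) / progress(training_losses, k)
```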

Does this help?

> BTW, I can not find an example of how to define training_losses for machine learning models either inside the example folder.

It's a little contrived, but there is one here: https://github.com/JuliaAI/IterationControl.jl/blob/f90a90808284fc39517a3d015589086c7e181e90/examples/square_rooter/square_rooter.jl#L27 . I will add an explicit link from the readme, thanks!
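The pattern in that example boils down to something like this (my own condensed sketch, not verbatim): record the per-iteration losses inside train! and expose them via training_losses, after which PQ becomes usable as a control.

```julia
using IterationControl

mutable struct SquareRooter
    x::Float64
    root::Float64
    training_losses::Vector{Float64}
end
SquareRooter(x) = SquareRooter(x, 1.0, Float64[])

function IterationControl.train!(m::SquareRooter, n)
    m.training_losses = Float64[]   # losses since the last train! call
    for _ in 1:n
        m.root = (m.root + m.x / m.root) / 2
        push!(m.training_losses, abs(m.root^2 - m.x))
    end
end
IterationControl.loss(m::SquareRooter) = abs(m.root^2 - m.x)
IterationControl.training_losses(m::SquareRooter) = m.training_losses

# With training_losses defined, PQ can be used alongside other controls:
model = SquareRooter(9.0)
IterationControl.train!(model, Step(2), PQ(alpha=2.5, k=2), NumberLimit(100))
```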

In MLJ, iterative models can buy into PQ by implementing MLJModelInterface.training_losses. For now, only MLJFlux models and TunedModel (a wrapper for model hyper-parameter optimisation) implement this.

ablaom commented 3 years ago

[Comment moved to #38]