Thuener commented 3 weeks ago

Hi Oscar,

Now with SDDiP, we actually have a heuristic to solve the model; convergence is no longer guaranteed. On several occasions I can see the model get stuck at a local optimum. I have been using BanditDuality to help the model escape those local optima, since using different cuts helps. However, StrengthenedConicDuality can make much slower progress.

Any ideas on how to get out of the local optimum? Is there any way to use different duality_handlers depending on whether the bound has stalled?

PS: I also have an issue with BanditDuality when trying to resume training: it gets stuck with just one type of duality. Example:

Resuming training, as I don't think we achieved convergence:

Then I'm stuck in a local optimum and StrengthenedConicDuality is never used again.

Thuener commented 3 weeks ago

I have adapted the BanditDuality code into a strategy that switches the duality handler when the bound stops moving. It would be something like this:

mutable struct _Stab{T}
    handler::T
end

mutable struct StabDuality <: SDDP.AbstractDualityHandler
    dual_handlers::Vector{_Stab}
    last_dual_handlers_index::Int

    function StabDuality(args::SDDP.AbstractDualityHandler...)
        return new(_Stab[_Stab(arg) for arg in args], 1)
    end
end

function Base.show(io::IO, handler::StabDuality)
    print(io, "StabDuality with multiple dual_handlers:")
    for dual_handler in handler.dual_handlers
        print(io, "\n * ", dual_handler.handler)
    end
end

function StabDuality()
    return StabDuality(SDDP.ContinuousConicDuality(), SDDP.StrengthenedConicDuality(), SDDP.LagrangianDuality())
end

function SDDP.prepare_backward_pass(node::SDDP.Node, handler::StabDuality, options::SDDP.Options)
    # Move to the next handler (round-robin) when the bound did not change in the last iteration
    if length(options.log) > 2 && isapprox(options.log[end-1].bound, options.log[end].bound; atol = 1e-6)
        handler.last_dual_handlers_index = handler.last_dual_handlers_index % length(handler.dual_handlers) + 1
    end
    return SDDP.prepare_backward_pass(node, handler.dual_handlers[handler.last_dual_handlers_index].handler, options)
end

function SDDP.get_dual_solution(node::SDDP.Node, handler::StabDuality)
    return SDDP.get_dual_solution(node, handler.dual_handlers[handler.last_dual_handlers_index].handler)
end

function SDDP.duality_log_key(handler::StabDuality)
    return SDDP.duality_log_key(handler.dual_handlers[handler.last_dual_handlers_index].handler)
end

Thuener commented 3 weeks ago

For the case where restarting training gets stuck...

The issue with the BanditDuality seems to be on this line: https://github.com/odow/SDDP.jl/blob/d2495d6a9a0886048d560df7e29c7b220656594c/src/plugins/duality_handlers.jl#L380

ContinuousConicDuality doesn't improve the bound, so const_bound is true and training stays stuck with ContinuousConicDuality. Maybe add some random choice among the arms?
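
One way to picture the "random choice among the arms" idea is an epsilon-greedy rule: usually pick the arm with the best mean reward, but with a small probability pick uniformly at random so that no arm is starved forever. The sketch below is self-contained and does not touch SDDP.jl internals; `choose_arm`, `epsilon`, and the reward histories are hypothetical names and data used only for illustration, not what BanditDuality actually does.

```julia
using Random
using Statistics

# Hypothetical helper, not part of SDDP.jl: pick an arm index from each arm's
# reward history. With probability `epsilon` explore uniformly at random;
# otherwise exploit the arm with the highest mean reward.
function choose_arm(
    rng::AbstractRNG,
    rewards::Vector{Vector{Float64}};
    epsilon::Float64 = 0.1,
)
    if rand(rng) < epsilon
        return rand(rng, 1:length(rewards))
    end
    # Arms without any history get +Inf so that they are tried at least once.
    means = [isempty(r) ? Inf : mean(r) for r in rewards]
    return argmax(means)
end

rng = MersenneTwister(1234)
# Illustrative reward histories: arm 1 never improved the bound.
rewards = [[0.0, 0.0], [0.5, 0.7], [0.1]]
counts = zeros(Int, 3)
for _ in 1:1_000
    counts[choose_arm(rng, rewards; epsilon = 0.2)] += 1
end
```

With `epsilon = 0.0` the rule is purely greedy, which is the situation that can lock training onto a single arm; any positive `epsilon` keeps every arm reachable.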

odow commented 3 weeks ago

Any ideas on how to get out of the local optimum?


Is there any way to use different duality_handlers depending on whether the bound has stalled?

Nope. We'd need to write a new duality handler.

PRs to improve Bandit welcome :smile: I vaguely remember issues related to the const_bound. But it was a while ago...

Thuener commented 3 weeks ago

I will create a PR in the future for this. Right now, I'm just testing it. Sometimes when changing the dual_handler I'm getting:

    Termination status : OPTIMAL
    Primal status      : FEASIBLE_POINT
    Dual status        : NO_SOLUTION.

Any ideas on why?

Answering my own question... It is because we can't change the dual_handler for only some nodes, so we have to change it only at the beginning of the backward pass.
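
A minimal sketch of that "switch only once per backward pass" guard, assuming the log grows by one entry per iteration: remember how many log entries have been seen and only allow a switch on the first call after the log has grown. `SwitchGuard` and `first_call_this_iteration!` are hypothetical names for illustration; they only mirror the idea of a `logs_seen` counter.

```julia
# Hypothetical state that remembers how many log entries were already seen.
mutable struct SwitchGuard
    logs_seen::Int
end

# Return true only on the first call after the log has grown, i.e. once per
# iteration, no matter how many nodes call it during the backward pass.
function first_call_this_iteration!(guard::SwitchGuard, log_length::Int)
    if log_length > guard.logs_seen
        guard.logs_seen = log_length
        return true
    end
    return false
end

guard = SwitchGuard(0)
# Three nodes visited in the same iteration: only the first may switch.
same_iteration = [first_call_this_iteration!(guard, 1) for _ in 1:3]
# The log has grown by one entry, so a switch is allowed again.
next_iteration = first_call_this_iteration!(guard, 2)
```

Every node visited in the same backward pass then sees the same handler, which avoids mixing handlers across nodes.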

Thuener commented 3 weeks ago

I achieved very good results by updating the BanditDuality with a simple solution.

It behaves like BanditDuality most of the time, but if it sees 2 iterations (parameter convergence_min_logs) with the same bound, it switches to the next arm. This update gives more diversity and helps with the issue where you read the cuts back in and stay stuck with only one arm.

mutable struct BanditDuality <: AbstractDualityHandler
    # ... (field declarations not shown)

    function BanditDuality(args::AbstractDualityHandler...)
        return new(_BanditArm[_BanditArm(arg, Float64[]) for arg in args], 1, 1, 10, 2)
    end
end

function prepare_backward_pass(node::Node, handler::BanditDuality, options::Options)
    # Just change once at each backward iteration
    if length(options.log) > handler.logs_seen
        _update_rewards(handler, options.log)
        handler.logs_seen = length(options.log)

        # Check for bound convergence stalling and change the handler if that is the case
        if length(options.log) > handler.convergence_min_logs &&
                isapprox(options.log[end].bound, options.log[end-handler.convergence_min_logs].bound; atol = 1e-6)
            index = (handler.last_arm_index % length(handler.arms)) + 1 # next handler
            @info "Change dual handler from $(handler.last_arm_index) to $index"
            handler.last_arm_index = index
            arm = handler.arms[index]
        else
            arm = _choose_best_arm(handler)
        end
    else
        # Same iteration as the previous call: keep the arm already chosen
        arm = handler.arms[handler.last_arm_index]
    end

    return prepare_backward_pass(node, arm.handler, options)
end
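
The stall check above can be isolated into a small, self-contained function for experimentation. This is only a sketch under stated assumptions: `next_arm_index` and its keyword names are hypothetical, the bound history is a plain vector rather than `options.log`, and when the bound has not stalled the sketch simply keeps the current arm instead of calling `_choose_best_arm`.

```julia
# Hypothetical helper, not part of SDDP.jl: given the history of bounds,
# rotate to the next arm (round-robin) when the bound has not moved over the
# last `min_logs` iterations; otherwise keep the current arm.
function next_arm_index(
    bounds::Vector{Float64},
    current::Int,
    n_arms::Int;
    min_logs::Int = 2,
    atol::Float64 = 1e-6,
)
    stalled =
        length(bounds) > min_logs &&
        isapprox(bounds[end], bounds[end-min_logs]; atol = atol)
    return stalled ? current % n_arms + 1 : current
end

# Illustrative bound history that stalls at 12.5 for the last three iterations.
bounds = [10.0, 12.0, 12.5, 12.5, 12.5]
new_index = next_arm_index(bounds, 1, 3)
```

The modulo arithmetic wraps the last arm back to the first, so every handler gets a turn before any is repeated.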
odow commented 3 weeks ago

iterations with the same bound it changes for the next arm

Oh that's a good idea

odow commented 3 weeks ago

I'm going to bed, but I sent you an invite to join as a collaborator. It should give you write access to edit my existing PR.

odow commented 3 weeks ago

Does #779 close this issue?

Thuener commented 2 weeks ago

Sorry, I will check; give me a few days.

Thuener commented 2 weeks ago

Yes, I can confirm that the PR solved the issue with bound stalling. Thanks @odow!