Closed Thuener closed 2 weeks ago
I have adapted the BanditDuality code into a strategy that switches the duality handler when the bound stops moving. It would look something like this:
mutable struct _Stab{T}
    handler::T
end

mutable struct StabDuality <: SDDP.AbstractDualityHandler
    dual_handlers::Vector{_Stab}
    last_dual_handlers_index::Int
    function StabDuality(args::SDDP.AbstractDualityHandler...)
        return new(_Stab[_Stab(arg) for arg in args], 1)
    end
end

function Base.show(io::IO, handler::StabDuality)
    print(io, "StabDuality with multiple dual_handlers:")
    for dual_handler in handler.dual_handlers
        print(io, "\n * ", dual_handler.handler)
    end
    return
end

function StabDuality()
    return StabDuality(
        SDDP.ContinuousConicDuality(),
        SDDP.StrengthenedConicDuality(),
        SDDP.LagrangianDuality(),
    )
end

function SDDP.prepare_backward_pass(
    node::SDDP.Node,
    handler::StabDuality,
    options::SDDP.Options,
)
    if length(options.log) > 2 &&
       isapprox(options.log[end-1].bound, options.log[end].bound; atol = 1e-6)
        handler.last_dual_handlers_index =
            handler.last_dual_handlers_index % length(handler.dual_handlers) + 1
    end
    return SDDP.prepare_backward_pass(
        node,
        handler.dual_handlers[handler.last_dual_handlers_index].handler,
        options,
    )
end

function SDDP.get_dual_solution(node::SDDP.Node, handler::StabDuality)
    return SDDP.get_dual_solution(
        node,
        handler.dual_handlers[handler.last_dual_handlers_index].handler,
    )
end

function SDDP.duality_log_key(handler::StabDuality)
    return SDDP.duality_log_key(
        handler.dual_handlers[handler.last_dual_handlers_index].handler,
    )
end
For the case where training gets stuck after restarting...
The issue with the BanditDuality seems to be on this line: https://github.com/odow/SDDP.jl/blob/d2495d6a9a0886048d560df7e29c7b220656594c/src/plugins/duality_handlers.jl#L380
ContinuousConicDuality doesn't improve the bound, so const_bound is true and training stays stuck with ContinuousConicDuality. Maybe choose randomly among the arms?
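One way to read the "random choice among the arms" idea is an epsilon-greedy rule: usually pick the arm with the best average reward, but occasionally pick one at random so that no arm is starved. The sketch below is self-contained and does not use SDDP.jl internals; `choose_arm_epsilon_greedy`, `mean_rewards`, and `epsilon` are hypothetical names for illustration, not how BanditDuality actually selects arms.

```julia
using Random

# Hypothetical helper, not part of SDDP.jl.
# `mean_rewards[i]` is the average reward observed so far for arm `i`.
# With probability `epsilon`, explore a uniformly random arm;
# otherwise exploit the arm with the highest average reward.
function choose_arm_epsilon_greedy(
    mean_rewards::Vector{Float64},
    epsilon::Float64;
    rng::AbstractRNG = Random.default_rng(),
)
    if rand(rng) < epsilon
        return rand(rng, 1:length(mean_rewards))  # explore
    end
    return argmax(mean_rewards)  # exploit
end

# With epsilon = 0.0 the rule is purely greedy and returns the best arm.
best = choose_arm_epsilon_greedy([0.1, 0.9, 0.3], 0.0)
```

A small positive `epsilon` would let a handler such as StrengthenedConicDuality be retried from time to time even if its recorded rewards look poor, at the cost of some iterations spent on weaker arms.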
Any ideas on how to get out of the local optimum?
Nope.
Is there any way to use different duality_handlers depending on the bound stalling?
Nope. We'd need to write a new duality handler.
PRs to improve Bandit welcome :smile: I vaguely remember issues related to the const_bound. But it was a while ago...
I will create a PR in the future for this. Right now, I'm just testing it. Sometimes when changing the dual_handler I'm getting:
Termination status : OPTIMAL
Primal status : FEASIBLE_POINT
Dual status : NO_SOLUTION.
Any ideas on why?
Answering my own question... It is because we can't change the dual_handler for only some of the nodes, so we have to change it only at the beginning of the backward pass.
I achieved very good results by updating the BanditDuality with a simple solution.
1738 -5.260732e+04 -3.734935e+02 5.905173e+04 717890 1
1739S -1.735529e+04 -3.737250e+02 5.918178e+04 718209 1
1741S -1.561158e+04 -3.737984e+02 5.927766e+04 719023 1
1743S -1.144220e+04 -3.738277e+02 5.931456e+04 719313 1
1744S -2.994307e+04 -3.785066e+02 5.957528e+04 719952 1
1746S -1.733834e+04 -3.786535e+02 5.967713e+04 720326 1
1747 -4.087047e+04 -3.786535e+02 5.971645e+04 721469 1
1748S -5.050328e+04 -3.788257e+02 6.018140e+04 722632 1
-------------------------------------------------------------------
status : time_limit
total time (s) : 6.018140e+04
total solves : 722632
best bound : -3.788257e+02
simulation ci : -1.380501e+05 ± 1.543136e+04
numeric issues : 0
-------------------------------------------------------------------
It typically behaves like BanditDuality, but if it sees 2 iterations (parameter convergence_min_logs) with the same bound, it changes to the next arm. This update gives more diversity and helps with the issue where you read the cuts and stay stuck with only one arm.
mutable struct BanditDuality <: AbstractDualityHandler
    arms::Vector{_BanditArm}
    last_arm_index::Int
    logs_seen::Int
    update_min_logs::Int
    convergence_min_logs::Int
    function BanditDuality(args::AbstractDualityHandler...)
        return new(_BanditArm[_BanditArm(arg, Float64[]) for arg in args], 1, 1, 10, 2)
    end
end

function prepare_backward_pass(
    node::Node,
    handler::BanditDuality,
    options::Options,
)
    # Just change once at each backward iteration
    if length(options.log) > handler.logs_seen
        _update_rewards(handler, options.log)
        handler.logs_seen = length(options.log)
        # Check for bound convergence stalling and change the handler if that is the case
        if length(options.log) > handler.convergence_min_logs &&
           isapprox(
               options.log[end].bound,
               options.log[end-handler.convergence_min_logs].bound;
               atol = 1e-6,
           )
            index = (handler.last_arm_index % length(handler.arms)) + 1 # next handler
            @info "Change dual handler from $(handler.last_arm_index) to $index"
            handler.last_arm_index = index
            arm = handler.arms[index]
        else
            arm = _choose_best_arm(handler)
        end
    else
        arm = handler.arms[handler.last_arm_index]
    end
    return prepare_backward_pass(node, arm.handler, options)
end
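The stall check and the round-robin step used in the code above can be exercised in isolation. This is a minimal sketch, assuming the bound history is available as a plain vector; `bound_stalled` and `next_index` are hypothetical helper names that do not exist in SDDP.jl, and the sample values are loosely based on the bounds in the log above.

```julia
# Hypothetical helpers, not part of SDDP.jl.

# True if the latest bound equals the bound `k` iterations earlier
# (within `atol`). Comparing only the two endpoints is enough if the
# bound is monotone across iterations, which is an assumption here.
function bound_stalled(bounds::Vector{Float64}, k::Int; atol::Float64 = 1e-6)
    return length(bounds) > k && isapprox(bounds[end], bounds[end-k]; atol = atol)
end

# Round-robin successor for 1-based indices: 1 -> 2, ..., n -> 1.
next_index(i::Int, n::Int) = i % n + 1

bounds = [-373.8277, -378.5066, -378.6535, -378.6535]

stalled_1 = bound_stalled(bounds, 1)  # last two bounds are identical
stalled_2 = bound_stalled(bounds, 2)  # bound moved two iterations ago
wrapped = next_index(3, 3)            # wraps from the last arm back to the first
```

Because `atol` is passed explicitly, `isapprox` uses a zero relative tolerance by default, so the comparison is a pure absolute-difference test. The tolerance of `1e-6` mirrors the value used in the thread and may need scaling for problems with very large or very small bounds.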
iterations with the same bound it changes for the next arm
Oh that's a good idea
I'm going to bed, but I sent you an invite to join as a collaborator. It should give you write access to edit my existing PR.
Does #779 close this issue?
Sorry, I will check. Give me a few days.
Yes, I can confirm that the PR solved the issue with bound stalling. Thanks @odow!
Hi Oscar,
Now with SDDiP, we effectively have a heuristic for solving the model. We no longer have a proof of convergence. On several occasions I have seen the model get stuck in a local optimum. I have been using BanditDuality to help the model escape these local optima, since using different cuts helps. However, StrengthenedConicDuality can make much slower progress.
Any ideas on how to get out of the local optimum? Is there any way to use different duality_handlers depending on whether the bound is stalling?
PS: I also have an issue with BanditDuality when trying to resume training: it gets stuck with just one type of duality handler. Example:
Resuming training, as I don't think we achieved convergence:
Then I'm stuck in a local optimum and StrengthenedConicDuality is never used again.