Closed: hghayoom closed this issue 1 year ago.
The MOO model is actually: https://github.com/odow/MOO. That model doesn't use SDDP. The decision-hazard was a two-stage stochastic program, in which we first compute the stocking rate, and then observe the price. The hazard-decision solves a different instance of the model for each price.
There is no tooling in SDDP.jl for explicitly modeling decision-hazard problems. Instead, you need to model the problem in which the decision is a state variable that is decided in the previous stage.
So in the HD example, the order_quantity is a control variable:
https://github.com/odow/SDDP.jl/blob/09c8b6da7389b5b866e26ccf3a611a677ad5c8a7/papers/policy_graph/inventory_management.jl#L21
and in the DH example, the order_quantity is a state variable:
https://github.com/odow/SDDP.jl/blob/09c8b6da7389b5b866e26ccf3a611a677ad5c8a7/papers/policy_graph/inventory_management.jl#L68
and we use the incoming value in the subproblem:
https://github.com/odow/SDDP.jl/blob/09c8b6da7389b5b866e26ccf3a611a677ad5c8a7/papers/policy_graph/inventory_management.jl#L84
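In short, the difference between the two formulations is one declaration. This is a schematic fragment (not the full linked example), shown side by side:

```julia
# HD (hazard-decision): order_quantity is a plain JuMP control variable,
# so it is chosen after this stage's demand is observed.
@variable(subproblem, 0 <= order_quantity)

# DH (decision-hazard): order_quantity is a state variable. We choose
# order_quantity.out now, before seeing next stage's demand, and the
# subproblem uses the incoming value order_quantity.in in its constraints.
@variable(subproblem, 0 <= order_quantity, SDDP.State, initial_value = 0)
```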
Thanks for your response. It was very insightful. I have some other questions about this inventory management example. I am using Visual Studio Code and I run the model in debug mode, so I can see the final variables in the variable window.
After running this example I see 500 results (because we simulate it 500 times), but each simulation has a different number of stages (for example, one of them has only one stage while another contains 3 stages). Are there any convergence criteria that lead to different numbers of stages? (See the picture.)
In the DH problem, I cannot save the outputs in the simulation results: results = SDDP.simulate(model, 500, [:state, :order_quantity, :lost_demand, :disposed_units, :backordered_units, :held_units, :demand]). Therefore, after I moved the definition of these variables to the beginning, I could save them. I moved these: https://github.com/odow/SDDP.jl/blob/09c8b6da7389b5b866e26ccf3a611a677ad5c8a7/papers/policy_graph/inventory_management.jl#L73 to here: https://github.com/odow/SDDP.jl/blob/09c8b6da7389b5b866e26ccf3a611a677ad5c8a7/papers/policy_graph/inventory_management.jl#L68
In the DH problem, for odd stages (e.g., 1, 3, 5, ...) demand and the other parameters are zero. I thought the reason is the decision node defined at the beginning of the DH policy graph here: https://github.com/odow/SDDP.jl/blob/09c8b6da7389b5b866e26ccf3a611a677ad5c8a7/papers/policy_graph/inventory_management.jl#L55, and that we are somehow expanding the stages so that each pair of consecutive nodes represents one time step. Results of even nodes (like 2):
Results of odd nodes (like 3):
Are there any convergence criteria that lead to different numbers of stages? (See the picture.)
If your graph is cyclic, then there will be different numbers of stages. This ensures that the sample mean of the cost for each scenario is an unbiased estimator for the objective value of the problem.
You can simulate to a fixed depth with:
SDDP.simulate(
    model,
    500;
    sampling_scheme = SDDP.InSampleMonteCarlo(;
        max_depth = 5,
        terminate_on_dummy_leaf = false,
    ),
)
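To see the effect, you can compare the number of stages in each replication (a sketch; `model` is assumed to be a trained model on the cyclic graph from the example):

```julia
# Default sampling: each forward pass exits the 0.95-probability cycle at a
# random point, so replication lengths differ.
results = SDDP.simulate(model, 500)
length.(results)  # a vector of varying lengths

# Fixed-depth sampling: every replication is truncated at exactly 5 stages.
results = SDDP.simulate(
    model,
    500;
    sampling_scheme = SDDP.InSampleMonteCarlo(;
        max_depth = 5,
        terminate_on_dummy_leaf = false,
    ),
)
length.(results)  # all entries are 5
```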
In the DH problem, I cannot save the outputs in the simulation results
Try
SDDP.simulate(model, 500; skip_undefined_variables = true)
In the DH problem, for odd stages (e.g., 1, 3, 5, ...) demand and the other parameters are zero. I thought the reason is the decision node defined at the beginning of the DH policy graph, and that we are somehow expanding the stages so that each pair of consecutive nodes represents one time step.
This follows from the definition of our graph:
Take a look at the :node_index fields of each simulation. They'll go decision, recourse, decision, ...
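For example (a sketch; `results` is assumed to come from SDDP.simulate on the two-node DH model, with the node names used in the linked example):

```julia
# Each element of results is one replication; each stage records which
# policy-graph node it visited.
[s[:node_index] for s in results[1]]
# alternates between the decision node and the recourse node
```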
You don't actually need the two nodes, though. You could do this (untested, so maybe typos, but you get the idea):
function infinite_lin_DH()
    graph = SDDP.LinearGraph(1)
    SDDP.add_edge(graph, 1 => 1, 0.95)
    return SDDP.PolicyGraph(
        graph;
        sense = :Min,
        lower_bound = 0.0,
        optimizer = HiGHS.Optimizer,
    ) do subproblem, node
        @variable(subproblem, -10 <= state <= 10, SDDP.State, initial_value = 0)
        @variable(subproblem, 0 <= order_quantity, SDDP.State, initial_value = 0)
        @variables(subproblem, begin
            lost_demand >= 0
            disposed_units >= 0
            backordered_units >= 0
            demand
        end)
        @constraints(subproblem, begin
            # Use the incoming order quantity: it was decided in the previous
            # stage, before this stage's demand was observed.
            state.out == state.in + order_quantity.in - demand + lost_demand - disposed_units
            backordered_units >= -state.out
        end)
        Pg = rand(Distributions.TruncatedNormal(5, 2, 0, 10), 50)
        sort!(Pg)
        SDDP.parameterize(ω -> JuMP.fix(demand, ω), subproblem, Pg)
        @stageobjective(
            subproblem,
            20 * order_quantity.out +  # Order cost
            2 * state.out +            # Holding cost ch.
            10 * backordered_units +   # Backorder cost cb.
            10 * disposed_units +      # Disposal cost cd.
            100 * lost_demand          # Lost demand cost cl.
        )
    end
end
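A minimal way to try the single-node version might look like this (a sketch: it assumes SDDP, HiGHS, Distributions, and JuMP are loaded, and the iteration limit and depth are arbitrary choices, not values from the example):

```julia
model = infinite_lin_DH()
SDDP.train(model; iteration_limit = 50, print_level = 0)
# Because the single-node graph is cyclic, cap the simulation depth so each
# replication has the same number of stages.
results = SDDP.simulate(
    model,
    100,
    [:order_quantity, :state];
    sampling_scheme = SDDP.InSampleMonteCarlo(;
        max_depth = 10,
        terminate_on_dummy_leaf = false,
    ),
)
```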
DH vs HD is a modeling choice. It doesn't change anything about the algorithm or require special structures.
Thank you, Oscar, for the illustrations. They helped me a lot in shaping my code. I wrote a code that compares the HD and DH problems over four stages. I need to learn them to use in my work. The HD problem works properly, but for the DH problem I get the infeasibility issue that you described here, and I don't know why I should get this error. I have defined positive and negative slack variables (+Borow and -s) so that the constraint should be feasible at all times, but I don't know why this happens.
using SDDP, HiGHS, Test, GLPK, Distributions, Random, Statistics

function finite_lin_DH()
    graph2 = SDDP.Graph(
        :root_node,
        # We have 4 time steps, and for each one we need a decision node and a recourse node.
        [:decision1, :recourse1, :decision2, :recourse2, :decision3, :recourse3, :decision4, :recourse4],
        [
            (:root_node => :decision1, 1.0),
            (:decision1 => :recourse1, 1.0),
            (:recourse1 => :decision2, 1.0),
            (:decision2 => :recourse2, 1.0),
            (:recourse2 => :decision3, 1.0),
            (:decision3 => :recourse3, 1.0),
            (:recourse3 => :decision4, 1.0),
            (:decision4 => :recourse4, 1.0),
            # (:recourse4 => :decision1, 0.95),
        ],
    )
    model = SDDP.PolicyGraph(
        graph2,
        sense = :Min,
        lower_bound = 0.0,
        optimizer = HiGHS.Optimizer,
    ) do sp, node
        @variable(sp, 0 <= x <= 8, SDDP.State, initial_value = 7)  # reservoir water level in each step
        @variable(sp, 0 <= p, SDDP.State, initial_value = 2)       # producing power with cost
        @variables(sp, begin
            y >= 0      # using reservoir water
            s >= 0      # spill water
            ξ           # inflow
            Borow >= 0  # borrow water if the water level is negative
        end)
        if node in [:decision1, :decision2, :decision3, :decision4]
            @constraint(sp, x.out == x.in)
            @stageobjective(sp, 20 * p.out)
        else
            @constraints(sp, begin
                p.in + y == 9
                x.out == x.in - y + ξ - s + Borow
            end)
            Pg = rand([2, 3], 4)
            # Pg = rand(Distributions.TruncatedNormal(5, 2, 0, 10), 4)
            SDDP.parameterize(sp, Pg) do ω
                JuMP.fix(ξ, ω)
            end
            @stageobjective(sp, 200 * s + 500 * Borow)
        end
    end
    return model
end
function finite_lin_HD()
    model = SDDP.PolicyGraph(
        # This time we define a linear graph.
        SDDP.LinearGraph(4),
        sense = :Min,
        lower_bound = 0.0,
        optimizer = HiGHS.Optimizer,
    ) do sp, node
        @variable(sp, 0 <= x <= 8, SDDP.State, initial_value = 7)  # reservoir water level in each step
        @variables(sp, begin
            y >= 0      # using reservoir water
            p >= 0      # power generation
            s >= 0      # spill water
            ξ           # inflow
            Borow >= 0  # borrow water if the water level is negative
        end)
        @constraints(sp, begin
            p + y == 9
            x.out == x.in - y + ξ - s + Borow
        end)
        Pg = rand([2, 3], 4)
        # Pg = rand(Distributions.TruncatedNormal(5, 2, 0, 10), 4)
        SDDP.parameterize(sp, Pg) do ω
            JuMP.fix(ξ, ω)
        end
        @stageobjective(sp, 200 * s + 500 * Borow + 20 * p)
    end
    return model
end
Random.seed!(1234)
begin
    model = finite_lin_DH()
    SDDP.train(model; iteration_limit = 30, print_level = 1)
    results = SDDP.simulate(model, 100, [:y, :p, :ξ, :x, :s, :Borow])
    objectives = [sum(s[:stage_objective] for s in simulation) for simulation in results]
    sample_mean = round(Statistics.mean(objectives); digits = 2)
    sample_ci = round(1.96 * Statistics.std(objectives) / sqrt(length(objectives)); digits = 2)
    println("DH Confidence_interval = $(sample_mean) ± $(sample_ci)")
end

Random.seed!(1234)
begin
    model = finite_lin_HD()
    SDDP.train(model; iteration_limit = 30, print_level = 1)
    results = SDDP.simulate(model, 100, [:y, :p, :ξ, :x, :s, :Borow])
    objectives = [sum(s[:stage_objective] for s in simulation) for simulation in results]
    sample_mean = round(Statistics.mean(objectives); digits = 2)
    sample_ci = round(1.96 * Statistics.std(objectives) / sqrt(length(objectives)); digits = 2)
    println("HD Confidence_interval = $(sample_mean) ± $(sample_ci)")
end
Here's how I would write your code.
Some tips:
Your DH model can become infeasible because the decision stage can choose a p that is greater than 9, and then the recourse constraint p.in + y == 9 has no feasible y. You have two options: add constraints so that p must be less than the demand in the next stage, or allow total supply to exceed demand. (Or add a new demand source that only operates when the price is 0.)
julia> using SDDP, HiGHS, Random, Statistics
julia> function finite_lin_DH()
           return SDDP.LinearPolicyGraph(;
               stages = 4,
               sense = :Min,
               lower_bound = 0.0,
               optimizer = HiGHS.Optimizer,
           ) do sp, node
               @variables(sp, begin
                   0 <= x <= 8, SDDP.State, (initial_value = 7)
                   p >= 0, SDDP.State, (initial_value = 2)
                   y >= 0
                   s >= 0
                   ξ
                   borrow >= 0
               end)
               @constraints(sp, begin
                   # This constraint is wrong, because p.in could be > 9 already.
                   # p.in + y == 9
                   p.in + y >= 9
                   x.out == x.in - y + ξ - s + borrow
               end)
               SDDP.parameterize(sp, rand([2, 3], 4)) do ω
                   JuMP.fix(ξ, ω)
               end
               @stageobjective(sp, 200s + 500borrow + 20p.out)
           end
       end
finite_lin_DH (generic function with 1 method)
julia> function finite_lin_HD()
           model = SDDP.LinearPolicyGraph(;
               stages = 4,
               sense = :Min,
               lower_bound = 0.0,
               optimizer = HiGHS.Optimizer,
           ) do sp, node
               @variables(sp, begin
                   0 <= x <= 8, SDDP.State, (initial_value = 7)
                   p >= 0
                   y >= 0
                   s >= 0
                   ξ
                   borrow >= 0
               end)
               @constraints(sp, begin
                   p + y == 9
                   x.out == x.in - y + ξ - s + borrow
               end)
               SDDP.parameterize(sp, rand([2, 3], 4)) do ω
                   JuMP.fix(ξ, ω)
               end
               @stageobjective(sp, 200s + 500borrow + 20p)
           end
           return model
       end
finite_lin_HD (generic function with 1 method)
julia> function train_and_compute_ci(model; replications = 100)
           SDDP.train(model; iteration_limit = 30, print_level = 0)
           results = SDDP.simulate(model, replications)
           objectives = [
               sum(s[:stage_objective] for s in simulation) for simulation in results
           ]
           sample_mean = round(Statistics.mean(objectives); digits = 2)
           sample_ci = round(1.96 * Statistics.std(objectives) / sqrt(replications); digits = 2)
           println("Confidence_interval = $(sample_mean) ± $(sample_ci)")
           return
       end
train_and_compute_ci (generic function with 1 method)
julia> Random.seed!(1234)
MersenneTwister(1234)
julia> begin
           model = finite_lin_DH()
           train_and_compute_ci(model)
       end
Confidence_interval = 356.0 ± 2.67
julia> Random.seed!(1234)
MersenneTwister(1234)
julia> begin
           model = finite_lin_HD()
           train_and_compute_ci(model)
       end
Confidence_interval = 379.6 ± 3.15
This shows that being able to adapt your decision to the realization is worth ~$24.
Thanks, Oscar. That is wonderful. Understanding both DH and HD is very important because it lets us calculate the EVPI. I suggest you add this example to your examples. I learned a lot from this.
Thanks again.
Hi Oscar, I am working with your amazing tool and I want to see some examples comparing DH and HD problems. I found this inventory problem, which compares the two approaches, but it is a little different from what I expected. I need some examples without a discount factor of 0.95. I know that in your dissertation for MOO you compared the results of DH and HD, but I cannot see how you calculated the DH results. I think the new version of this model is here. I'm wondering if there are other examples for DH, because a lot of the time we are not aware of the uncertainties that might affect us.
Thanks