odow / SDDP.jl

A JuMP extension for Stochastic Dual Dynamic Programming
https://sddp.dev

On accessing control variables from previous stage #400

Closed bonnkleiford closed 3 years ago

bonnkleiford commented 3 years ago

Hi again, Oscar!

I have another query, but I wanted to make a separate issue since other users might also be interested in it.

Anyway, I am currently modelling a problem with a control variable buy. This control variable is created in every stage until the end of the planning horizon. Now, let's say I am in stage t=10. If I write a constraint, the control variable buy has the time index 10 (therefore, buy[t=10]). What if, in this stage, I want to use buy[t=1] and not buy[t=10]? What can I do?

Just like this: A[t=10] - B[t=10] <= buy[t=1]

I tried putting in another index [i=1:T] when I create the control variable buy, but it then becomes buy[t=1:T, i=1:T], which is not really a good idea. Do you have any idea how I can do this in SDDP.jl?

Thanks a lot, as always, for the prompt help.

BIKEY

odow commented 3 years ago

Thanks for making an issue. This does come up a lot, and the documentation could be more explicit.

Rule: If you want to use a variable from a previous stage, it must be a state variable.

If you want to use the first-stage (and only the first-stage) decision later on:

SDDP.LinearPolicyGraph(
    stages = 10,
) do sp, t
    @variable(sp, x >= 0, SDDP.State, initial_value = 0)
    @variable(sp, buy)
    @variable(sp, first_stage_buy, SDDP.State, initial_value = 0)
    if t == 1
        # Record the first-stage purchase in the state variable.
        @constraint(sp, first_stage_buy.out == buy)
    else
        # Carry the first-stage purchase forward unchanged.
        @constraint(sp, first_stage_buy.out == first_stage_buy.in)
    end
    if t == 10
        # A and B stand in for whatever expressions your model uses.
        @constraint(sp, A - B <= first_stage_buy.out)
    end
end

If you want to use the buy variables from 1...t-1, then you need something like:

SDDP.LinearPolicyGraph(
    stages = 10,
) do sp, t
    @variable(sp, x >= 0, SDDP.State, initial_value = 0)
    @variable(sp, buy)
    @variable(sp, old_buys[1:10], SDDP.State, initial_value = 0)
    for i = 1:10
        if i < t
            # Carry earlier purchases forward unchanged.
            @constraint(sp, old_buys[i].out == old_buys[i].in)
        elseif i == t
            # Record this stage's purchase.
            @constraint(sp, old_buys[i].out == buy)
        else
            # Future slots are unused in this stage; fix them to zero.
            fix(old_buys[i].out, 0.0)
        end
    end
    if t == 10
        @constraint(sp, A - B <= sum(old_buys[i].out for i = 1:t-1))
    end
end

Rule: you must initialize the same number of state variables in every stage, even if they are not used.

bonnkleiford commented 3 years ago

Oh, great! Thanks a lot, Oscar.

I will try the second code. So if I want not the sum of all the buys I made, but only one particular buy, I would do this:

SDDP.LinearPolicyGraph(
    stages = 50,
) do sp, t
    @variable(sp, x >= 0, SDDP.State, initial_value = 0)
    @variable(sp, buy)
    @variable(sp, old_buys[1:50], SDDP.State, initial_value = 0)
    for i = 1:50
        if i < t
            @constraint(sp, old_buys[i].out == old_buys[i].in)
        elseif i == t
            @constraint(sp, old_buys[i].out == buy)
        else
            fix(old_buys[i].out, 0.0)
        end
    end
    if t > 10
        @constraint(sp, A - B <= old_buys[t-10].out)
    end
end

My current problem has 30 stages and I want something like:

At stage 10: A - B <= buy[t=1]
At stage 11: A - B <= buy[t=2]
At stage 12: A - B <= buy[t=3]
and so on, until
At stage 30: A - B <= buy[t=20]

With the code above, this becomes: @constraint(sp, A - B <= old_buys[t-10].out)

I think this is it. 🤔

odow commented 3 years ago

Yes, if you want to use a variable from 10 stages ago, then you need 10 state variables to keep track of them.

This also works for things like perishable inventory. You have N bins. New inventory goes into bin 1. Each time step, the inventory in bin n moves to bin n+1, and the inventory in bin N disappears (perishes).
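That bin-shifting update can be sketched in plain Julia, outside any SDDP.jl model (the `step_bins` helper is hypothetical, for illustration only, not part of the package):

```julia
# One time step of the perishable-inventory bins: new inventory enters
# bin 1, every bin's contents move one slot along, and whatever was in
# the last bin perishes (it is simply dropped).
function step_bins(bins::Vector{Float64}, new_inventory::Float64)
    N = length(bins)
    next = similar(bins)
    next[1] = new_inventory
    for n in 2:N
        next[n] = bins[n-1]  # contents of bin n-1 move to bin n
    end
    return next
end

bins = [5.0, 3.0, 2.0]       # three bins, oldest inventory last
bins = step_bins(bins, 7.0)  # -> [7.0, 5.0, 3.0]; the 2.0 has perished
```

In the SDDP.jl formulation, the same shift is written as linear constraints linking each bin's `.out` to the previous bin's `.in`, as in the `leads` constraints further down this thread.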

bonnkleiford commented 3 years ago

My problem is actually an inventory problem with lead time and I did it like this:

model = SDDP.LinearPolicyGraph(
    stages = 50,
    sense = :Max,
    upper_bound = 10000000000,
    optimizer = GLPK.Optimizer
) do sp, t
    @variable(sp, inventory >= 0, SDDP.State, initial_value = 65)
    @variables(sp, begin
        0 <= sell
        0 <= buy <= 1000
        demand
        end)
    @variable(sp, old_buys[1:50], SDDP.State, initial_value = 0)
    for i = 1:50
        if i < t
            @constraint(sp, old_buys[i].out == old_buys[i].in)
        elseif i == t
            @constraint(sp, old_buys[i].out == buy)
        else
            fix(old_buys[i].out, 0.0)
        end
    end

    if t==1
        @constraint(sp, sell <= inventory.out)
        @constraint(sp, inventory.out + sell == initial)

    elseif 2 <= t <= lead_time
        @constraints(sp, begin
                sell <= 25*demand
                sell <= inventory.out
                inventory.out == inventory.in - sell + initial_inventory
                end)
    else
        @constraints(sp, begin
                sell <= 25*demand
                sell <= inventory.out
                inventory.out == inventory.in - sell + old_buys[t-10].out
                end)
    end

    @stageobjective(sp, sales_price*sell - inventory_cost*inventory.out - buying_price*old_buys[t].out)
    SDDP.parameterize(sp, [20,21,22,23,24,25,26,27,28,29,30]) do ω
        JuMP.fix(demand, ω)
    end
end

With lead_time = 10, I think this already works.

Thank you :)

odow commented 3 years ago

I would formulate it like:

lead_time = 10
initial_inventory = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
model = SDDP.LinearPolicyGraph(
    stages = 50,
    sense = :Max,
    upper_bound = 10_000_000_000,  # Choose a better bound!!! This is too big.
    optimizer = GLPK.Optimizer
) do sp, t
    @variables(sp, begin
        inventory >= 0, SDDP.State, (initial_value = 65)
        leads[1:lead_time], SDDP.State, (initial_value = 0)
        0 <= sell
        0 <= buy <= 1000
    end)
    new_inventory = t <= lead_time ? initial_inventory[t] : leads[lead_time].in
    @constraints(sp, begin
        leads[1].out == buy
        [i = 2:lead_time], leads[i-1].in == leads[i].out
        sell <= inventory.in
        inventory.out == inventory.in - sell + new_inventory
        demand_con, sell <= 0
    end)
    @stageobjective(sp, 
        sales_price * sell - inventory_cost * inventory.out - buying_price * buy
    )
    SDDP.parameterize(sp, 20:30) do ω
        set_normalized_rhs(demand_con, 25 * ω)
    end
end

Hopefully it gives you some ideas. I haven't tested, so there might be typos, etc.

bonnkleiford commented 3 years ago

Oh, thanks. I tried it and it runs, but the solution is far from the one my original implementation gives. The one I had above got close to the optimal solution. Hmmm.

But, really, thanks for the help. I will also try going around it.

odow commented 3 years ago

I probably made a typo somewhere.

Ah. I wrote

sell <= inventory.in

but you have

sell <= inventory.out

bonnkleiford commented 3 years ago

Ahh yes. I tried it now and it also checks out!

Thank you again very much :)

odow commented 3 years ago

I'm going to re-open to remind myself to add this to the documentation.

bonnkleiford commented 3 years ago

Oops my bad. Hehe. Thanks, Oscar! I'll surely have more questions to come.

By the way, I have been looking for your email but I am not sure which one is the correct one.

odow commented 3 years ago

I wrote to your uni.lu address.