martinbiel / StochasticPrograms.jl

Julia package for formulating and analyzing stochastic recourse models.
MIT License

[Question] objective_value, evaluate_decision and the parts of the objective. #41

Closed: FHell closed this issue 2 years ago

FHell commented 2 years ago

Hi, thank you for your excellent package!

I am a bit confused about the interpretation of the various evaluate_decision and objective_value calls though and the documentation didn't clear things up.

Given a stochastic program like:

x_opt = argmin_x f(x) + E_s[h(x, s)]

We need the value f(x_opt) as well as h(x_opt, s) for various scenarios s.

We have a scenario that satisfies h(x_opt, s_zero) = 0.

Now the question is: what do the evaluate_decision and objective_value calls actually do? We noticed that

mean([objective_value(sp, i) for i in 1:n]) ≈ objective_value(sp)

holds.

So objective_value(sp, i) should be f(x) + h(x, s_i).

However, evaluate_decision does not reproduce these objective values at all. E.g.:

evaluate_decision(sp, optimal_decision(sp), scens[i])

is larger, but of the same magnitude as

objective_value(sp, i)

Could you explain what is going on here? Is there a way to get f(x) and h(x, s) separately?

martinbiel commented 2 years ago

With your notation,

objective_value(sp, i)

is h(x_opt, s_i), i.e. you query the contribution of scenario i to the objective after running the optimization. I would guess that your experiment with the mean does not hold in general.

Note also that you will sometimes get a sign discrepancy if the objective senses in the first and second stage do not match, for example if the first-stage objective is minimized and the second-stage objective is maximized. The sign of objective_value(sp, i) is then flipped when the total objective is calculated.

For a two-stage model, objective_value(sp, i) defaults to objective_value(sp, 2, i); in general, the stage should be supplied as well. I will have to revisit this API, since it blocks the otherwise natural call objective_value(sp, 1), which could mean the first-stage contribution, i.e. f(x_opt). I currently do not offer a function for querying just f(x_opt) after optimization.
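
As a sketch of the calls discussed here (hypothetical snippet, not from the package docs; it assumes `sp` is an already-optimized two-stage model and `i` a scenario index):

```julia
# Sketch only: `sp` is an optimized two-stage model, `i` a scenario index.
h_i  = objective_value(sp, i)     # contribution of scenario i, h(x_opt, s_i)
h_i2 = objective_value(sp, 2, i)  # same value with the stage given explicitly
ov   = objective_value(sp)        # total objective, f(x_opt) + E_s[h(x_opt, s)]
```

There is currently no counterpart that returns just f(x_opt) on its own.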

Finally,

evaluate_decision(sp, x, xi)

calculates f(x) + h(x, xi), i.e. the full result of taking decision x and ending up in scenario xi. Hence, evaluate_decision(sp, x_opt, s_i) is expected to be larger than objective_value(sp, i) by exactly f(x_opt) (bar the sign change explained above). You could use this to calculate f(x_opt) through

f = evaluate_decision(sp, x_opt, s_1) - (sign change?) * objective_value(sp, 1)
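
Numerically, the identity can be sanity-checked with toy numbers (plain Julia, no package involved; the values below are made up and only stand in for the quantities above, assuming matching objective senses so there is no sign change):

```julia
using Statistics

f = 10.0             # stand-in for the first-stage cost f(x_opt)
h = [2.0, 4.0, 6.0]  # stand-ins for the recourse costs h(x_opt, s_i)

total = f + mean(h)  # role of objective_value(sp)
eds   = f .+ h       # role of evaluate_decision(sp, x_opt, s_i)

# The per-scenario difference is constant and equals f:
@assert all(isapprox.(eds .- h, f))
@assert isapprox(mean(eds), total)
```
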
e-zolotarevskaya commented 2 years ago

Hi, I am using your great package together with @FHell. Thank you for your reply!

To test whether your suggestion holds in our case, we saved the objective values and the results of evaluate_decision() for all scenarios. If evaluate_decision(sp, x_opt, i) == objective_value(sp, i) + f(x_opt), the difference should be scenario-independent. But looking at the difference between evaluate_decision() and objective_value(sp, i) for different scenarios, we found that not to be the case: the difference varies by as much as a factor of 2. Also, the average of evaluate_decision is not the complete objective value.

(Our package is purely minimization, so no sign ambiguity...)

While checking this we also discovered a bug in cache_solution!(sp): it seems to rewrite all scenario objective values to be equal to the total objective value.

Code and data to reproduce is here: https://github.com/PIK-ICoNe/RecoveryModel.jl

The concrete test is this file: https://github.com/PIK-ICoNe/RecoveryModel.jl/blob/main/minimal_example.jl

Here's the key part:

sp = instantiate(es, scens, optimizer = Cbc.Optimizer)

##

optimize!(sp)

##

od = optimal_decision(sp)

ov = objective_value(sp)
# We save objective value and result of evaluate_decision for each scenario
ovs = [objective_value(sp, i) for i in 1:length(scens)]
eds = [evaluate_decision(sp, od, scen) for scen in scens]

##

@show mean(eds) .- ov # -218809.05039871123

##
# If f() = evaluate_decision(sp, od, s)-objective_value(sp,2,s), then eds[i]-ovs[i] should be scenario independent, that is constant. We check if that's true:
@show maximum(eds .- ovs) # -604955.7360808562 
@show minimum(eds .- ovs) #  -1.0477844558753553e6
# Unfortunately it's not the same.

##
# We also found a bug in cache_solution. After it objective_value(sp,2,i) == objective_value(sp) for any i.
cache_solution!(sp)

##

ovs2 = [objective_value(sp, i) for i in 1:length(scens)]

unique(ovs2 .- ov) # [0.0]
FHell commented 2 years ago

I managed to reduce our code to a true minimal working example:

https://github.com/PIK-ICoNe/RecoveryModel.jl/blob/main/mwe.jl

Fully reproduced below. My guess right now is that this has to do with the presence of first-stage variables in the second-stage objective. At least I couldn't reproduce the problem in the absence of this; e.g. the code in

https://github.com/PIK-ICoNe/RecoveryModel.jl/blob/main/mwe.jl

does not show the problem...

using StochasticPrograms
using Random
using Statistics

Random.seed!(2)

##
n = 5
t_max = 10
prices = collect(1:t_max)

scens = [@scenario t_r = rand(1:t_max-2) probability = 1/n for i in 1:n]

##

system = @stochastic_model begin 
    @stage 1 begin
        @decision(model, 3. >= a[t in 1:t_max] >= 0.)
        @constraint(model, sum(a) == 20.)
        @objective(model, Min, sum(a .* prices))
    end
    @stage 2 begin
        @uncertain t_r
        trs = t_r:t_r+2
        @known(model, a)
        @recourse(model, a2[t in trs] >= 0.)
        @constraint(model, sum(a2) == sum(a[trs]))
        @constraint(model, a2[t_r + 1] == 0)
        @objective(model, Min, sum((a2[trs] .- a[trs]) .* prices[trs])) # Delete  ".- a[trs]" and the problem goes away
    end
end

##

using Cbc

sp = instantiate(system, scens, optimizer = Cbc.Optimizer)

##

optimize!(sp)

##

od = optimal_decision(sp)

ov = objective_value(sp)

ovs = [objective_value(sp, i) for i in 1:length(scens)]
eds = [evaluate_decision(sp, od, scen) for scen in scens]

##
@show ov - evaluate_decision(sp, od) ; # 0.0

nothing
##
# Neither eds nor ovs have ov as the average:

@show mean(eds) .- ov # 24.6
@show mean(ovs) .- ov # -77.0

nothing
##
# If f() = evaluate_decision(sp, od, s)-objective_value(sp,2,s), then eds[i]-ovs[i] should be scenario independent:

@show maximum(eds .- ovs) # 109.0
@show minimum(eds .- ovs) # 91.0

nothing
martinbiel commented 2 years ago

Hello again,

You were correct, the bug stems from the presence of first-stage variables in the second-stage objective. They were being overwritten by the first-stage objective during the decision evaluation. My latest commit to master should have fixed it. I at least get

@show mean(eds) .- ov # 1.4210854715202004e-14

as expected, and eds[i] - ovs[i] is now scenario independent in your example. Thanks for the MWE! I will look into a more robust treatment of the first-stage objective in the future, including getters for just f(x).

I will look into the caching as well. It is hard to cache every value properly, so I have left it as an optional feature. I would also not recommend it for large models, as it becomes quite time-consuming.

FYI I am going on parental leave very soon, and will pause development for a while. Do keep posting issues if you find more bugs. I hope your project works out well!

FHell commented 2 years ago

Brilliant! Thanks a lot, we'll send you the paper once we're done with it. :)

Enjoy your parental leave! I am about to go myself, too :)