LAMPSPUC / HydroPowerModels.jl

A Julia/JuMP package for Hydrothermal Multistage Steady-State Power Network Optimization solved by Stochastic Dual Dynamic Programming (SDDP).
MIT License

fix discount factor #10

Closed Thiago-NovaesB closed 1 year ago

Thiago-NovaesB commented 1 year ago

discount_factor was used as the transition probability

andrewrosemberg commented 1 year ago

(I believe) these are equivalent. Oscar uses the transition probability as the discount factor. See for example: https://odow.github.io/SDDP.jl/stable/examples/infinite_horizon_hydro_thermal/.
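For reference, this is how SDDP.jl encodes a discount factor as the probability on a self-loop of the policy graph, following the linked infinite-horizon example. A minimal sketch, assuming HiGHS is available as the LP solver; the single-node reservoir subproblem and its numbers are illustrative, not taken from HydroPowerModels.jl:

```julia
using SDDP, HiGHS

# Discount factor 0.95 encoded as the probability of looping back to the
# single decision node: with probability 0.05 the process terminates.
graph = SDDP.Graph(:root)
SDDP.add_node(graph, :stage)
SDDP.add_edge(graph, :root => :stage, 1.0)
SDDP.add_edge(graph, :stage => :stage, 0.95)

model = SDDP.PolicyGraph(
    graph;
    sense = :Min,
    lower_bound = 0.0,
    optimizer = HiGHS.Optimizer,
) do sp, node
    # Illustrative subproblem: a reservoir level held constant, paying its
    # value each stage (not a HydroPowerModels.jl model).
    @variable(sp, 0 <= level <= 10, SDDP.State, initial_value = 5)
    @constraint(sp, level.out == level.in)
    @stageobjective(sp, level.out)
end

SDDP.train(model; iteration_limit = 20, print_level = 0)

# The bound approaches 5 / (1 - 0.95) = 100 as the iteration limit grows.
bound = SDDP.calculate_bound(model)
```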

It is true that the motivation for a discount factor need not come directly from uncertainty about the future (e.g., it can come from the decreasing value of money), but these things are related.

odow commented 1 year ago

It is true that the motivation for a discount factor need not come directly from uncertainty about the future (e.g., it can come from the decreasing value of money), but these things are related.

More than related, these things are identical. A dollar today is worth more than a dollar tomorrow because the world might end (okay, too much, but you might die, or the business you were going to spend your dollar on might go bankrupt overnight). This is the same as saying we transition to the next stage with the discount factor.
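The equivalence in expectation can be illustrated numerically. A minimal sketch in plain Julia (the per-cycle cost and continuation probability are made-up numbers): simulate walks that continue with probability p, paying cost c each cycle, and compare the average cost with the closed-form discounted sum c / (1 - p):

```julia
using Random, Statistics

# Per-cycle cost c; after each cycle the process continues with probability p.
c, p = 3.0, 0.5

# Discount-in-the-objective value of the infinite horizon: c / (1 - p).
analytic = c / (1 - p)  # 6.0

# Discount-as-transition-probability: random-length walks, costs undiscounted.
function walk_cost(rng, c, p)
    total = c                 # the first cycle always happens
    while rand(rng) < p       # continue with probability p
        total += c
    end
    return total
end

rng = MersenneTwister(1234)
costs = [walk_cost(rng, c, p) for _ in 1:100_000]

# Same expectation, but the random-length walks have positive variance.
mean(costs)
```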

Thiago-NovaesB commented 1 year ago

@odow After a little reflection, I see that with the discount factor we multiply the average objective for the period by p. Using the transition probability, we average considering (1-p)*N terms to be 0. Since these terms are chosen randomly, these averages in fact converge to the same value. But I'm a little doubtful about what happens when we don't have many elements in the sum. Do you know if the variance is the same too? Is the behavior exactly the same, or do they just converge to the same value? Do you know of any theoretical result about this?

Thiago-NovaesB commented 1 year ago

I was also unsure about the performance. On the one hand, using the transition probability, it seems we need to evaluate fewer terms, and the average time of each iteration will be smaller, since a forward pass can end prematurely. On the other hand, using a discount factor, the average will be taken over more terms, which may indicate that it is better(?). Furthermore, from what I understand, the backward pass will go through the same nodes as the forward pass, right? So with the transition probability we will have fewer cuts, right? If possible, I would like to know what you think about computational performance as well.

odow commented 1 year ago

Using the transition probability, we average considering (1-p)*N terms to be 0.

Correct.

Since these terms are chosen randomly, these averages in fact converge to the same value.

Correct.

But I'm a little doubtful about what happens when we don't have many elements in the sum.

Not sure if I understand which sum we're talking about.

Do you know if the variance is the same too?

It will be larger, because everything is in current dollars, and different simulations are of different lengths.

Imagine I have a graph with two nodes, and a discount factor of 0.5. There is no random variable, and the objective is 1 for node 1 and 2 for node 2. You might get sequences of forward passes that look like:

x - 1 - 2                   (cost = 3)
x - 1 - 2 - 1 - 2           (cost = 6)
x - 1 - 2                   (cost = 3)
x - 1 - 2 - 1 - 2 - 1 - 2   (cost = 9)

If you compare against the discount in the objective, you'd get:

x - 1 - 2 - 1 - 2 - 1 - 2   (cost = 1 + 2 + 0.5 * (1 + 2 + 0.5 * (1 + 2)))

These have the same mean (5.25), but the discount-in-the-transition passes have a (sample) variance of 8.25, whereas the discounted objective is deterministic here.
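The arithmetic in this example can be checked directly; a quick sketch in plain Julia, using only the Statistics standard library:

```julia
using Statistics

# Discount in the objective: one fixed-length pass, three cycles of the
# two-node graph with discount factor 0.5.
discounted = 1 + 2 + 0.5 * (1 + 2 + 0.5 * (1 + 2))  # 5.25

# Discount as transition probability: the four forward passes above, with
# every cost in current (undiscounted) dollars.
passes = [3.0, 6.0, 3.0, 9.0]

mean(passes)  # 5.25 — the same mean
var(passes)   # 8.25 — the (sample) variance of the random-length passes
```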

Is the behavior exactly the same? or do they just converge to the same value?

They converge to the same value. The iterations will be different because the algorithm walks a different sequence of subproblems.

Do you know any theoretical result about this?

https://doi.org/10.1002/net.21932

I was also unsure about the performance. On the one hand, using the transition probability, it seems we need to evaluate fewer terms, and the average time of each iteration will be smaller, since a forward pass can end prematurely. On the other hand, using a discount factor, the average will be taken over more terms, which may indicate that it is better(?). Furthermore, from what I understand, the backward pass will go through the same nodes as the forward pass, right? So with the transition probability we will have fewer cuts, right? If possible, I would like to know what you think about computational performance as well.

Nice questions. I think you have a good handle on the trade-off.

My guess is that for acyclic graphs (like the finite-horizon linear graph used here), it's better to use the discount factor in the objective, because there are only finitely many nodes that can be visited.

I've mainly used discount factors in an infinite-horizon setting, in which you cannot put the discount in the objective function.
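For contrast, on a finite-horizon linear graph the discount can be placed directly in the stage objectives, so every forward pass visits all stages exactly once. A minimal sketch, with an illustrative single-node subproblem and HiGHS as an assumed solver (not a HydroPowerModels.jl model):

```julia
using SDDP, HiGHS

# Finite horizon: discount each stage objective by ρ^(t - 1), so every
# forward pass visits all T stages exactly once (no early termination).
T, ρ = 12, 0.95
model = SDDP.LinearPolicyGraph(;
    stages = T,
    sense = :Min,
    lower_bound = 0.0,
    optimizer = HiGHS.Optimizer,
) do sp, t
    # Illustrative subproblem: a reservoir level held constant each stage.
    @variable(sp, 0 <= level <= 10, SDDP.State, initial_value = 5)
    @constraint(sp, level.out == level.in)
    @stageobjective(sp, ρ^(t - 1) * level.out)
end

SDDP.train(model; iteration_limit = 10, print_level = 0)

# Deterministic problem: the bound equals 5 * sum(ρ^(t-1) for t in 1:T).
bound = SDDP.calculate_bound(model)
```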

You might want to read https://odow.github.io/SDDP.jl/stable/explanation/theory_intro/, which implements the algorithm.

Thiago-NovaesB commented 1 year ago

Thanks for the explanations and for your paper!