> Is my understanding of the approaches to uncertainty modeling correct?
Yes
> Which approach do you think is most appropriate?
This doesn't have an answer; it depends on your problem. Choose a metric that measures how "good" a policy is, and then pick the model that achieves the best policy for your problem. Note that how you model the uncertainty is a modeling choice. It's not the real world, but that doesn't matter.
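For example, one common metric is the out-of-sample distribution of total cost. A minimal sketch, assuming `model` is an already trained SDDP.jl policy with a state variable `x`:

```julia
using SDDP, Statistics

# A minimal sketch: judge a trained policy by the distribution of its total
# cost over simulated scenarios. `model` is assumed to be a trained SDDP.jl
# policy; the metric (mean, a quantile, worst case, ...) is your choice.
simulations = SDDP.simulate(model, 500, [:x])
objectives = [sum(stage[:stage_objective] for stage in sim) for sim in simulations]
println("mean = ", mean(objectives))
println("95% quantile = ", quantile(objectives, 0.95))
```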
> As for the 5th approach mentioned above, does SDDP.jl support such a problem formulation?
Sure, you can make a policy graph with the Markov transition matrices:
```
[[1/3 1/3 1/3], [1 0 0; 0 1 0; 0 0 1], ..., [1 0 0; 0 1 0; 0 0 1]]
```
but this means that when we see the realization in stage 1, we know the realization of the uncertainty in all future stages.
Again, this is a modeling choice. Whether it is suitable is up to you.
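Concretely, here is a minimal sketch of that scenario-fan construction, assuming three equally likely historical scenarios, a hypothetical horizon `T = 4`, placeholder dynamics and costs, and HiGHS as an example solver:

```julia
using SDDP, HiGHS

T = 4  # hypothetical number of stages
# Stage 1 branches into one of three scenarios; the identity matrices then
# keep every future stage on whichever scenario was revealed.
transition_matrices = Matrix{Float64}[[1/3 1/3 1/3]]
for t in 2:T
    push!(transition_matrices, [1.0 0.0 0.0; 0.0 1.0 0.0; 0.0 0.0 1.0])
end

model = SDDP.MarkovianPolicyGraph(
    transition_matrices = transition_matrices,
    sense = :Min,
    lower_bound = 0.0,
    optimizer = HiGHS.Optimizer,
) do subproblem, node
    t, scenario = node  # each node is a (stage, Markov state) tuple
    @variable(subproblem, x >= 0, SDDP.State, initial_value = 0.0)
    # Placeholder dynamics and cost; replace with your real subproblem.
    @constraint(subproblem, x.out <= x.in + scenario)
    @stageobjective(subproblem, x.out)
end
SDDP.train(model; iteration_limit = 10)
```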
> Should we try to avoid using the first bold approach? Or, in other words, how much benefit can we get from modeling the uncertainty?
I don't have an answer for you because this is problem-dependent.
> If we model the uncertainty using the first approach and train the SDDP model for as long as possible, it should achieve the same effect as modeling the uncertainty with other, better methods. Is this true?
No!!!
A policy trained to "optimality" on a poor choice of model will probably perform worse in practice than a policy trained for fewer iterations on a model that better represents the real world.
Thank you very much, @odow ! I'm very excited every time I see your reply and I always learn a lot!
Now I feel I have a deeper understanding of SDDP.jl. But I still have one question. If we take the 5th approach and train a policy on the scenarios by introducing a Markovian policy graph, how should we evaluate a decision rule when we observe a new uncertainty realization in practice? (Since we don't know which node we have arrived at.)
> How should we evaluate a decision rule when we observe a new uncertainty realization in practice?
Now you see the problem with this approach :smile:
It's up to you. You might want to pick the scenario that is most similar.
Or you might realize that this isn't a very appropriate model for sequential decision making under uncertainty.
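If you do go with the nearest-scenario heuristic, a minimal sketch of one way to do it, assuming `model` is the trained Markovian policy from above and the historical values and current stage are hypothetical placeholders:

```julia
using SDDP

# Hypothetical stage-t inflow values of the three historical scenarios.
historical = [10.0, 25.0, 40.0]
observed = 27.5  # the new realization we just saw in practice
nearest = argmin(abs.(historical .- observed))  # index of the closest scenario

# Evaluate the decision rule of node (t, nearest) from the incoming state.
t = 2  # hypothetical current stage
rule = SDDP.DecisionRule(model; node = (t, nearest))
solution = SDDP.evaluate(rule; incoming_state = Dict(:x => 0.0))
```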
Thank you @odow ! Thank you very much! 😀🤝🤝
But if I understand correctly, perhaps this problem can be circumvented by pre-specifying some structured decision-rule functions for each stage (rather than for each node). Is that right? (Although this may be a very rough approximation.)
There are other ways to find (sub-optimal) policies for sequential decision problems, such as (linear) decision rules.
SDDP.jl does not support custom decision rules, although it has come up in discussion: https://github.com/odow/SDDP.jl/issues/696
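For illustration only (this is not SDDP.jl functionality), fitting an affine decision rule $u_t = a_t + b_t \xi_t$ over sampled scenarios can be written as a deterministic-equivalent LP in JuMP. Everything below (data, bounds, cost) is a hypothetical placeholder:

```julia
using JuMP, HiGHS

# Hypothetical data: 3 sampled scenarios × 2 stages of inflow ξ.
ξ = [10.0 12.0; 20.0 18.0; 30.0 33.0]
S, T = size(ξ)

model = Model(HiGHS.Optimizer)
@variable(model, a[1:T])            # intercepts of the affine rule
@variable(model, b[1:T])            # slopes of the affine rule
@variable(model, u[1:S, 1:T] >= 0)  # the decision implied in each scenario
# The same rule u_t = a_t + b_t * ξ_t must hold in every scenario.
@constraint(model, [s in 1:S, t in 1:T], u[s, t] == a[t] + b[t] * ξ[s, t])
# Placeholder objective: minimize the average total decision cost.
@objective(model, Min, sum(u) / S)
optimize!(model)
```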
I got it!
Hi, Prof. @odow, how do you do? I'm just wondering what is the correct way to model the uncertainty from historical data when using SDDP.jl, and about some related issues. In order to use SDDP.jl, we must model the uncertain inflows as stagewise-independent random variables (is that right?). I learned from the tutorials, some related papers, and my personal understanding that we can generally model the uncertainty according to the following ideas:

… SDDP.jl, adopt the Markov chain approach to construct an appropriate Markov process.

My questions are:

1. Is my understanding of the approaches to uncertainty modeling correct? Which approach do you think is most appropriate?
2. As for the 5th approach mentioned above, does SDDP.jl support such a problem formulation? Is there an essential difference between it and the Bellman optimality principle followed by SDDP.jl? Or can they be transformed into each other?
3. Should we try to avoid using the first bold approach? Or, in other words, how much benefit can we get from modeling the uncertainty?
4. Since we use SDDP.jl for multi-stage stochastic programming to train an optimal policy functional $\pi$, ideally this functional should naturally incorporate a sufficiently good noise model. Following this logic, if we model the uncertainty using the first approach and train the SDDP model for as long as possible, it should achieve the same effect as modeling the uncertainty with other, better methods. Is this true?

Best regards!