Differentiate between observations and states

Digitalized-Energy-Systems / opfgym

A gymnasium-compatible framework to create reinforcement learning (RL) environments for solving the optimal power flow (OPF) problem. Contains five OPF benchmark environments for comparable research.

MIT License


Currently, only the observation space is defined. The same space is also used for sampling states, which can cause side effects. For example, if the observation space is reduced (partial observability), the unobserved state channels may no longer be sampled correctly.

Approach: Explicitly define a state space for sampling and an observation space for observations. Further things to do:

Add two methods that return the observation and the state, respectively.
Store the state in the info dict, so that the algorithm can use the state as well.
Open: How to deal with power flow results? They require a costly power flow computation and are not strictly required for the Markov property. However, they are part of the power system's state.
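
A minimal sketch of the proposed split, to make the discussion concrete. Everything here is an assumption for illustration and not taken from opfgym: the class `SketchEnv`, the attribute `state_space`, the methods `get_state` and `get_obs`, and the `"state"` key in the info dict are hypothetical names, and the small `Box` class is only a standard-library stand-in for a space like `gymnasium.spaces.Box`. The open question about power flow results is left out.

```python
import random
from dataclasses import dataclass


@dataclass
class Box:
    """Tiny stand-in for a box space (hypothetical, for illustration only)."""
    low: list
    high: list

    def sample(self, rng):
        # Draw every channel uniformly from its own bounds.
        return [rng.uniform(lo, hi) for lo, hi in zip(self.low, self.high)]


class SketchEnv:
    """Hypothetical environment; names are not taken from opfgym."""

    def __init__(self, observed_idx, seed=None):
        # Full state, e.g. three load values with made-up bounds.
        self.state_space = Box(low=[0.0, 0.0, 0.0], high=[1.0, 1.0, 1.0])
        self.observed_idx = list(observed_idx)
        # The observation space is derived from the state space,
        # never the other way around.
        self.observation_space = Box(
            low=[self.state_space.low[i] for i in self.observed_idx],
            high=[self.state_space.high[i] for i in self.observed_idx],
        )
        self.rng = random.Random(seed)
        self.state = None

    def get_state(self):
        # Return a copy of the full state.
        return list(self.state)

    def get_obs(self):
        # Return only the observed channels of the state.
        return [self.state[i] for i in self.observed_idx]

    def reset(self):
        # Sampling always uses the state space, so unobserved channels
        # are still drawn from their own bounds.
        self.state = self.state_space.sample(self.rng)
        info = {"state": self.get_state()}
        return self.get_obs(), info


env = SketchEnv(observed_idx=[0, 2], seed=0)
obs, info = env.reset()
```

With this layout, shrinking `observed_idx` only changes what `get_obs` returns; the sampled state keeps all three channels and stays available through `info["state"]`.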

Related to #10 and #8.


Differentiate between observations and states #29