RuntimeError when setting busbar to line extremity

EloyAnguiano commented 9 months ago

Environment

Grid2op version: 1.9.5
lightsim version: 0.7.5
gym: 0.21.0
gymnasium: 0.28.1
stable-baselines3: 2.0.0
System: ubuntu20.04
Grid2Op environment: l2rpn_idf_2023

Bug description

There are some grid states that lead to a RuntimeError whenever I try to set a busbar for a line extremity. The action performed looks like this:

This action will:
- NOT change anything to the injections
- NOT perform any redispatching action
- NOT modify any storage capacity
- NOT perform any curtailment
- NOT force any line status
- NOT switch any line status
- NOT switch anything in the topology
- Set the bus of the following element(s):
- Assign bus 1 to line (extremity) id 73 [on substation 111]
- Not raise any alert

How to reproduce

Code snippet

from copy import deepcopy
import numpy as np
import grid2op

from grid2op.Action import PlayableAction
from grid2op.Chronics import MultifolderWithCache
from grid2op.Converter.Converters import Converter

from gym.spaces import Box, Dict, MultiBinary, MultiDiscrete
from lightsim2grid import LightSimBackend

from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.vec_env import SubprocVecEnv

from l2rpn_baselines.utils.gymenv_custom import GymEnvWithRecoWithDN
from stable_baselines3 import PPO
import gym

class FlatObservationConverter(GymEnvWithRecoWithDN):
    # class FlatObservationConverter(GymEnv_Legacy):
    def __init__(self, env, ex_obs) -> None:  # noqa: ANN001
        # super().__init__(env)  # noqa: ERA001
        super().__init__(env, reward_cumul="sum", safe_max_rho=0.7)

        self.ex_obs = ex_obs

        self.observation_space = self._build_obs_space()

        self.observation_space.close = FlatObservationConverter.new_close
        self.observation_space.to_gym = self.to_gym

    def _build_obs_space(self):
        return Box(shape=self.observation(self.ex_obs).shape,
                   low=-np.inf, high=np.inf)

    def new_close(**kwargs):
        pass

    def seed(self, seed):
        pass

    def step(self, action):
        vals = super().step(action)

        # Differentiate between gym (4 return values) and gymnasium (5 return values)
        if len(vals) == 5:
            obs, reward, done, _, info = vals
        else:
            obs, reward, done, info = vals
        return obs, reward, done, info

    def reset(self):
        vals = super().reset()
        if len(vals) == 2:
            return vals[0]
        return vals

    def to_gym(self, obs):
        return self.observation(obs)

    def observation(self, obs):
        return self._get_observation_vector(obs)

    def _get_observation_vector(self, obs):
        return obs.prod_v

class MultiLevelBusActionConverter(Converter):
    BUSBARS = 2

    def __init__(self, env, action_space, ex_obs):  # noqa: ARG002
        self.action_space = action_space

        self.n_gen = env.init_env.n_gen
        self.n_load = env.init_env.n_load
        self.n_line = env.init_env.n_line
        self.n_storage = env.init_env.n_storage
        self.n_shunt = env.init_env.n_shunt

        self.limit_gen = self.n_gen + 1
        self.limit_load = self.limit_gen + self.n_load
        self.limit_line_or = self.limit_load + self.n_line
        self.limit_line_ex = self.limit_line_or + self.n_line
        self.limit_storage = self.limit_line_ex + self.n_storage

        self.n_elems = self.n_gen + self.n_load + 2 * \
            self.n_line + self.n_storage

        self.gym_space = MultiDiscrete([1 + self.n_elems, 1 + self.BUSBARS])

        self._init_size = self.action_space.size()
        self.__class__ = MultiLevelBusActionConverter.init_grid(
            self.action_space)

    def sample(self):
        return self.gym_space.sample()

    @property
    def nvec(self):
        return self.gym_space.nvec

    def from_gym(self, gym_act):
        elem, value = gym_act[0], gym_act[1]

        if elem == 0:
            # Do nothing
            return self.action_space({})

        if value == 0:
            # Disconnect element
            value = -1

        action = self.action_space({})
        if elem < self.limit_gen:
            # Element is Generator
            action.gen_set_bus = (elem - 1, value)

        elif elem < self.limit_load:
            # Element is Load
            action.load_set_bus = (elem - self.limit_gen, value)

        elif elem < self.limit_line_or:
            # Element is Line or
            action.line_or_set_bus = (elem - self.limit_load, value)

        elif elem < self.limit_line_ex:
            # Element is Line ex
            action.line_ex_set_bus = (elem - self.limit_line_or, value)

        elif elem < self.limit_storage:
            # Element is Storage
            action.storage_set_bus = (elem - self.limit_line_ex, value)

        else:
            raise ValueError()

        return action

class GymEnvMultiProc(gym.Env):
    def __init__(self, gym_env_g2op):
        self.gym_env_g2op = gym_env_g2op

        # Action
        gym_act_space = gym_env_g2op.action_space
        self.action_space = MultiDiscrete(gym_act_space.nvec)

        # Observation
        spaces = gym_env_g2op.observation_space
        if isinstance(spaces, Box):
            self.observation_space = Box(
                shape=spaces.shape, low=spaces.low, high=spaces.high)
        else:
            self.observation_space = Dict(spaces=deepcopy(spaces))

    def step(self, *args, **kwargs):
        return self.gym_env_g2op.step(*args, **kwargs)

    def reset(self, *args, **kwargs):
        return self.gym_env_g2op.reset(*args, **kwargs)

    def seed(self, *args, **kwargs):
        return self.gym_env_g2op.seed(*args, **kwargs)

def make_env(name):
    grid2op_env = grid2op.make(name,
                               backend=LightSimBackend(),
                               test=False,
                               action_class=PlayableAction,
                               chronics_class=MultifolderWithCache)

    # CHRONICS
    _ = grid2op_env.chronics_handler.reset()
    grid2op_env.set_max_iter(864)  # 3days

    # GYM ENVIRONMENT
    obs = grid2op_env.reset()

    # OBSERVATION SPACE
    gym_env = FlatObservationConverter(grid2op_env, obs)

    gym_env.action_space = MultiLevelBusActionConverter(
        gym_env, grid2op_env.action_space, obs)

    return GymEnvMultiProc(gym_env)

if __name__ == "__main__":
    # ENVIRONMENT
    env_kwargs = {"name": "l2rpn_idf_2023"}

    gym_env = make_vec_env(make_env,
                           env_kwargs=env_kwargs,
                           n_envs=16,
                           vec_env_cls=SubprocVecEnv)
    # gym_env = make_env("l2rpn_idf_2023")
    agent = PPO("MlpPolicy",
                gym_env,
                n_steps=1000,
                batch_size=16,
                tensorboard_log=f"./TEST",
                verbose=1)

    agent.learn(total_timesteps=100000, progress_bar=False)
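
The index arithmetic used in `from_gym` above can be checked in isolation. The following is a minimal standalone sketch of that mapping; the function name `decode_element` and the toy grid sizes are hypothetical, chosen for illustration only, and are not the sizes of l2rpn_idf_2023.

```python
def decode_element(elem, n_gen, n_load, n_line, n_storage):
    """Map a flat element index (0 = do nothing) to (kind, local_id)."""
    if elem == 0:
        return ("do_nothing", None)
    # Exclusive upper bound of each element family, as in the converter.
    limit_gen = n_gen + 1
    limit_load = limit_gen + n_load
    limit_line_or = limit_load + n_line
    limit_line_ex = limit_line_or + n_line
    limit_storage = limit_line_ex + n_storage
    if elem < limit_gen:
        return ("gen", elem - 1)
    if elem < limit_load:
        return ("load", elem - limit_gen)
    if elem < limit_line_or:
        return ("line_or", elem - limit_load)
    if elem < limit_line_ex:
        return ("line_ex", elem - limit_line_or)
    if elem < limit_storage:
        return ("storage", elem - limit_line_ex)
    raise ValueError(f"element index {elem} out of range")


# Hypothetical toy sizes: 2 generators, 3 loads, 4 lines, 1 storage unit.
sizes = dict(n_gen=2, n_load=3, n_line=4, n_storage=1)
for elem in (0, 1, 3, 6, 10, 14):
    print(elem, decode_element(elem, **sizes))
```

With these toy sizes the valid indices are 0 to 14, which matches the `1 + n_elems` entries declared in the `MultiDiscrete` space.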

Current output

Using cuda device
Logging to ./TEST/PPO_2
Process ForkServerProcess-14:
Traceback (most recent call last):
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/site-packages/stable_baselines3/common/vec_env/subproc_vec_env.py", line 35, in _worker
    observation, reward, terminated, truncated, info = env.step(data)
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/site-packages/stable_baselines3/common/monitor.py", line 94, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/site-packages/shimmy/openai_gym_compatibility.py", line 257, in step
    obs, reward, done, info = self.gym_env.step(action)
  File "/home/eloy.anguiano/repos/tesis-eab/minim_script.py", line 202, in step
    return self.gym_env_g2op.step(*args, **kwargs)
  File "/home/eloy.anguiano/repos/tesis-eab/minim_script.py", line 44, in step
    vals = super().step(action)
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/site-packages/l2rpn_baselines/utils/gymenv_custom.py", line 238, in step
    g2op_obs, reward, done, info = self.init_env.step(g2op_act)
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/site-packages/grid2op/Environment/baseEnv.py", line 3211, in step
    self.backend.apply_action(self._backend_action)
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/site-packages/lightsim2grid/lightSimBackend.py", line 696, in apply_action
    self._grid.update_storages_p(backendAction.storage_power.changed,
RuntimeError: DataLoad::change_p: Impossible to change the active value of a disconnected load (check load id 5)
Traceback (most recent call last):
  File "/home/eloy.anguiano/repos/tesis-eab/minim_script.py", line 250, in <module>
    agent.learn(total_timesteps=100000, progress_bar=False)
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/site-packages/stable_baselines3/ppo/ppo.py", line 308, in learn
    return super().learn(
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/site-packages/stable_baselines3/common/on_policy_algorithm.py", line 259, in learn
    continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, n_rollout_steps=self.n_steps)
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/site-packages/stable_baselines3/common/on_policy_algorithm.py", line 178, in collect_rollouts
    new_obs, rewards, dones, infos = env.step(clipped_actions)
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 197, in step
    return self.step_wait()
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/site-packages/stable_baselines3/common/vec_env/subproc_vec_env.py", line 130, in step_wait
    results = [remote.recv() for remote in self.remotes]
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/site-packages/stable_baselines3/common/vec_env/subproc_vec_env.py", line 130, in <listcomp>
    results = [remote.recv() for remote in self.remotes]
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/multiprocessing/connection.py", line 255, in recv
    buf = self._recv_bytes()
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/multiprocessing/connection.py", line 419, in _recv_bytes
    buf = self._recv(4)
  File "/home/eloy.anguiano/miniconda3/envs/l2rpn_new/lib/python3.10/multiprocessing/connection.py", line 388, in _recv
    raise EOFError
EOFError

Expected output

If this really is an action that cannot be performed, it should result in an illegal or erroneous action and truncate the episode, rather than raising a RuntimeError that stops the execution.
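
As a stopgap on the user side, the behaviour described above can be approximated by catching the exception around `step`. This is only a sketch under assumptions: `SafeStepWrapper` and `ToyEnv` are hypothetical names, the 4-tuple return follows the old gym API used in the snippet, and the wrapper does not fix the underlying backend problem. After such an error the wrapped environment should be reset before it is used again.

```python
class SafeStepWrapper:
    """Wrap an env so that a RuntimeError raised in step() ends the episode."""

    def __init__(self, env, fallback_reward=0.0):
        self.env = env
        self.fallback_reward = fallback_reward
        self._last_obs = None

    def reset(self):
        self._last_obs = self.env.reset()
        return self._last_obs

    def step(self, action):
        try:
            obs, reward, done, info = self.env.step(action)
        except RuntimeError as exc:
            # Report the failure and terminate instead of crashing the worker.
            info = {"backend_error": str(exc)}
            return self._last_obs, self.fallback_reward, True, info
        self._last_obs = obs
        return obs, reward, done, info


class ToyEnv:
    """Hypothetical stand-in for a real environment, for demonstration only."""

    def reset(self):
        return 0

    def step(self, action):
        if action == "bad":
            raise RuntimeError("simulated backend failure")
        return 1, 1.0, False, {}


env = SafeStepWrapper(ToyEnv())
env.reset()
print(env.step("ok"))
print(env.step("bad"))
```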

BDonnot commented 9 months ago

Can you provide me with your training script? I cannot reproduce it with everything I have tried :-/

EloyAnguiano commented 9 months ago

Yes, I will "get a photo" of the state of the grid when this happens to reproduce it. Do you recommend checking any value in particular apart from topology and load values?

BDonnot commented 9 months ago

The training script would be better. The "photo" of the grid is not that useful :-/

Also, is this for the environment (env.step), the forecast (obs.simulate or obs.get_forecasted_env) or the Simulator (obs.get_simulator)?

BDonnot commented 9 months ago

> Do you recommend checking any value in particular apart from topology and load values?

The problem is in the topology: somehow a load is disconnected but the "done" flag is not True.
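
One way to look for this condition is to scan the topology vector for elements assigned to bus -1, which is how grid2op marks a disconnected element. The sketch below uses plain lists so it runs on its own; the helper name `disconnected_elements` and the toy values are hypothetical. With a real observation the inputs would presumably be `obs.topo_vect` and a position array such as `load_pos_topo_vect` or `storage_pos_topo_vect`.

```python
def disconnected_elements(topo_vect, pos_topo_vect):
    """Return local ids of elements whose bus in topo_vect is -1.

    topo_vect:     bus of every element of the grid (-1 means disconnected)
    pos_topo_vect: for one element family, position of each element in topo_vect
    """
    return [i for i, pos in enumerate(pos_topo_vect) if topo_vect[pos] == -1]


# Hypothetical toy grid with six elements.
topo_vect = [1, 1, -1, 2, 1, -1]
load_pos_topo_vect = [0, 2]      # loads 0 and 1
storage_pos_topo_vect = [5]      # storage unit 0

print("disconnected loads:", disconnected_elements(topo_vect, load_pos_topo_vect))
print("disconnected storages:", disconnected_elements(topo_vect, storage_pos_topo_vect))
```

Logging this after every step, together with the "done" flag, would show the first step at which an element is disconnected while the episode keeps running.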

BDonnot commented 9 months ago

Also have you checked that:

1) You did not recode an RL model from scratch, but use for example stable-baselines3, ray / rllib or any other "presumed bug-free" machine learning library? If you coded the training of an ML / RL model from scratch, please check that you don't use the grid when done=True.
2) Which gym env are you using? Which version of gymnasium (or gym, if you are sticking with gym) are you using?
3) Do you train on a single machine, single core? Or is there any parallelism / asynchronous execution happening somewhere? If so, are you sure you don't use the same backend for different environments?
4) (I'll keep writing if I get other ideas)

For point 3 above I mean:

you SHOULD NOT DO:

import grid2op
from lightsim2grid import LightSimBackend

backend = LightSimBackend()
env1 = grid2op.make(env_name, backend=backend)
env2 = grid2op.make(env_name, backend=backend)

But always:

import grid2op
from lightsim2grid import LightSimBackend

env1 = grid2op.make(env_name, backend=LightSimBackend())
env2 = grid2op.make(env_name, backend=LightSimBackend())

EloyAnguiano commented 9 months ago

That should not be a problem, as I instantiate the backend in each subprocess. You have a code snippet above. I am going to keep updating it to make it a simpler reproduction of the problem.

EloyAnguiano commented 9 months ago

Is this enough? Do you need any explanation of any step or class?

BDonnot commented 9 months ago

I'll have a look asap :-) But that's a first step yes. Thanks a lot :-)

BDonnot commented 9 months ago

Oh I see this is caused by storage units and not by generator or loads. I will see how to reproduce it and bring a fix for this.

BDonnot commented 8 months ago

By the way, if you want to get rid of this issue, you can install the package from source (following the instructions) or from a compiled version, here for example: https://github.com/BDonnot/lightsim2grid/actions/runs/6668902300 (choose the right file to download depending on your python version)

BDonnot commented 8 months ago

Or you can pip install the development version: pip install LightSim2Grid==0.7.6.dev0
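
To confirm which version ends up installed in the active environment, the package metadata can be queried from Python. This is a small sketch using only the standard library (Python 3.8+); the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("LightSim2Grid:", installed_version("LightSim2Grid"))
print("Grid2Op:", installed_version("Grid2Op"))
```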

BDonnot commented 3 months ago

Should be fixed by now. Reopen if you still encounter this bug :-)

BDonnot / lightsim2grid