Simulate reward is wrong

djmax008 commented 4 years ago

Environment

Grid2op version: 0.9.4
Lightsim2grid version: 0.2.0
System: Centos7
Additional system information

Bug description

If the first chronic ends in a game over, obs.simulate in all following chronics always returns a game over, no matter what.

How to reproduce

from grid2op.MakeEnv import make
from grid2op.Action import PowerlineChangeAction
from lightsim2grid.LightSimBackend import LightSimBackend

backend = LightSimBackend()
env = make('l2rpn_wcci_2020',
           action_class=PowerlineChangeAction,
           backend=backend)

# Run two episodes: the problem shows up after an earlier episode ended in a game over.
for epi in range(2):
    print('#################', epi)
    obs = env.reset()
    done = False
    step = 0
    while not done:
        # Do-nothing action, passed both to simulate and to the real step.
        action = env.action_space({})
        obs_simulate, reward_simulate, done_simulate, info_simulate = obs.simulate(action)
        obs, reward, done, info = env.step(action)
        step += 1
        # Report steps where the simulated and the actual reward differ widely.
        if abs(reward_simulate - reward) > 50:
            print('Reward of step {}, Reward of simulate {}, step {}'.format(reward, reward_simulate, step))

Command line

Run the above code several times: the reward returned by step differs from the one returned by simulate.

Some suggestions

After I replaced the reset function in LightSimBackend with the one from version 1.0, the bug disappeared. I therefore think the backend is not actually reset, even though the environment has been reset.
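
To make that hypothesis concrete, here is a toy model in plain Python. It is not lightsim2grid or Grid2op code: ToyBackend, ToyEnv and the stale_reset flag are hypothetical names, and the only assumption illustrated is that simulate works on a copy of a backend whose state survives reset.

```python
import copy


class ToyBackend:
    """Hypothetical stand-in for a backend; not lightsim2grid's API."""

    def __init__(self, stale_reset):
        self.stale_reset = stale_reset  # True mimics the suspected bug
        self.diverged = False

    def reset(self):
        # A correct reset clears the divergence flag; the stale one keeps it.
        if not self.stale_reset:
            self.diverged = False


class ToyEnv:
    """Hypothetical environment that simulates on a copy of its backend."""

    def __init__(self, backend):
        self.backend = backend

    def reset(self):
        self.backend.reset()

    def fail(self):
        # Pretend the power flow diverged, i.e. a game over.
        self.backend.diverged = True

    def simulate(self):
        # The simulation runs on a copy, so it inherits any stale state.
        sim_backend = copy.deepcopy(self.backend)
        return sim_backend.diverged  # True means "game over"


def simulate_after_game_over(stale_reset):
    env = ToyEnv(ToyBackend(stale_reset))
    env.reset()
    env.fail()   # first episode ends in a game over
    env.reset()  # second episode starts
    return env.simulate()


print(simulate_after_game_over(stale_reset=True))   # True
print(simulate_after_game_over(stale_reset=False))  # False
```

With the stale reset, the second episode reports a game over from simulate before any action has been taken, which matches the symptom described above. Whether the real backend behaves this way is only a guess.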

BDonnot commented 4 years ago

I cannot use this code to reproduce the issue: it contains too much. Can you please write a script that reproduces your issue while calling step only a few times, and likewise for simulate?

Thanks
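
A shorter reproduction along those lines might look like the sketch below. The helper name compare_step_and_simulate and its parameters are hypothetical. It relies only on env.reset, env.step, env.action_space({}) and obs.simulate as used in the script above, and it has not been run against the affected versions.

```python
def compare_step_and_simulate(env, n_episodes=2, n_steps=3, tol=50.0):
    """Return (episode, step, reward, simulated reward) for each mismatch."""
    mismatches = []
    for episode in range(n_episodes):
        obs = env.reset()
        for step in range(1, n_steps + 1):
            action = env.action_space({})  # do-nothing action
            _, sim_reward, _, _ = obs.simulate(action)
            obs, reward, done, _ = env.step(action)
            # Record steps where simulate and step disagree by more than tol.
            if abs(sim_reward - reward) > tol:
                mismatches.append((episode, step, reward, sim_reward))
            if done:
                break
    return mismatches
```

Calling compare_step_and_simulate(env) with the env built in the original script would return an empty list when simulate and step agree within tol on every checked step.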

djmax008 commented 4 years ago

Hi, I have updated to lightsim2grid v0.2.1. Everything works now and I no longer see anything wrong, so I will close this issue. Thanks.

BDonnot / lightsim2grid