@enjeeneer Any thoughts?
@rdnfn I haven't come across this behaviour previously. I note one line in particular from the logs: random agent: resetting env. I suspect RLlib automatically resets the environment upon instantiation, which is messing with Energym. I've had problems in the past with energym's .reset() method, which the devs haven't implemented - see here. If this is the case, can we bypass RLlib's auto-reset?
I think you're right, the problem is with the .reset() method. I think we do want to implement this though, as it is a core part of the OpenAI gym interface. Note that energym does implement the .reset() method, just not in the way of OpenAI gym: it doesn't return an observation. Instead, it is simply:
def reset(self):
    """Resets the simulation."""
    self.close()
    self.kpis.reset()
    self.initialize()
Then, in the Beobench integration, reset() is defined as:
def reset(self) -> dict:
    """Resets the energym simulation and returns the first observation.

    Args:
        None

    Returns:
        obs (dict):
            first observation from the reset environment
    """
    self.env.reset()
    obs = self.env.get_output()
    obs = self.obs_converter(obs)
    return obs
In their example notebooks, the Energym docs never actually use their own reset function; they just start with env.get_output() instead. Looking at their implementation of the env __init__() method, they already initialize the simulation there, so effectively we are initialising twice. I think the problem might come from calling the self.fmu.terminate() method: this might run the simulation through to the original end date (i.e. for a full year). I can't find good documentation for that terminate() method, so I am uncertain what exactly it does. This is where it is implemented.
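To make the double-initialisation concrete, here is a minimal, self-contained sketch of the pattern (the class and the counter are simplified stand-ins for illustration, not the actual Energym code):

class SimulationEnv:
    """Simplified stand-in for an Energym environment (not the real class)."""

    def __init__(self):
        self.num_initialisations = 0
        self.initialize()  # Energym already initialises the simulation here ...

    def initialize(self):
        self.num_initialisations += 1

    def close(self):
        # In Energym this calls self.fmu.terminate(), whose exact effect is unclear.
        pass

    def reset(self):
        self.close()
        self.initialize()  # ... so a reset() right after construction initialises again.


env = SimulationEnv()           # first initialisation (inside __init__)
env.reset()                     # RLlib-style auto-reset -> second initialisation
print(env.num_initialisations)  # prints 2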
I think there are two things to take away: (1) the first reset() call should not re-initialise the simulation, since __init__() already initialises it; (2) the behaviour of repeated reset() calls (and of terminate()) is still unclear and may need further attention.
I proposed a fix in https://github.com/rdnfn/beobench_contrib/pull/2. With this fix, the first call of reset() does not trigger a re-initialisation and thus avoids the behaviour where the entire year-long simulation is run without inputs. Note that there may still be unexpected behaviour with further reset() calls, so I added a warning that is triggered on the second reset() call.
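Roughly, the idea looks like the following sketch of the wrapper's reset() method (this is not the exact code in the PR, and the _has_been_reset attribute name is just illustrative):

import warnings

def reset(self) -> dict:
    """Return the first observation without re-initialising on the first call.

    Sketch of the approach only: the simulation is already initialised in
    __init__(), so the first reset() just fetches the current output; later
    calls warn and fall back to energym's own reset().
    """
    if getattr(self, "_has_been_reset", False):
        warnings.warn(
            "Calling reset() more than once may lead to unexpected behaviour "
            "in the underlying Energym simulation."
        )
        self.env.reset()
    self._has_been_reset = True
    obs = self.env.get_output()
    return self.obs_converter(obs)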
With this fix, which is sufficient for now but not perfect, I am closing this issue; anybody should feel free to reopen it if they discover an issue with repeated reset() calls.
Edit:
Given that RLlib appears to sometimes call reset() twice depending on the configuration, the Energym integration now has a gym_kwarg that allows disabling the reset functionality completely. To set it, use the following config:
env:
  config:
    gym_kwargs:
      ignore_reset: True
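For reference, honouring such a flag in the wrapper could look roughly like the sketch below (the attribute name ignore_reset mirrors the config key, but the actual implementation in the integration may differ):

def reset(self) -> dict:
    """Sketch: skip energym's reset entirely when ignore_reset is set."""
    if self.ignore_reset:
        # With the flag set, RLlib's automatic reset calls cannot disturb the
        # running simulation; we just return the current observation.
        obs = self.env.get_output()
        return self.obs_converter(obs)
    self.env.reset()
    obs = self.env.get_output()
    return self.obs_converter(obs)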
Problem
When running rewex01_test01.yaml, the output of the Energym simulation suggests that the simulation runs for an entire year before the agent takes its first action. Is this possible? See the output below. To reproduce it, run the following command in the repo dir on the dev/general branch:
Potential Solution
I am not sure whether this is really a bug or whether the output can be explained in some other way.