rdnfn / beobench

A toolkit providing easy and unified access to building control environments for reinforcement learning (RL).
https://beobench.readthedocs.io
MIT License

Questionable warm-up period in Energym integration #43

Closed rdnfn closed 2 years ago

rdnfn commented 2 years ago

Problem

When running rewex01_test01.yaml, the output of the Energym simulation suggests that the simulation runs for an entire year before the agent takes its first action. Is this possible? See the output below. To reproduce it, run the following command in the repo directory on the dev/general branch:

beobench run -c ./beobench/data/configs/rewex01_test01.yaml -d .[extended]
Reading input and weather file for preprocessor program.
The IDF version of the input file ///root/Energym_runs/1648646930_1219589//resources//Apartments2_heavy_insulated_pump.idf starts with 9
Successfully finish reading weather file.
This is the Begin Month: 1
Time (0) set is smaller than minimun allowed (1 day). Day will be set to 1.
This is the Day of the Begin Month: 1
This is the End Month: 12
This is the Day of the End Month: 31
Day of week was left blank in input file.
This is the New Day of Week:  
Running EPMacro...
ExpandObjects Started.
No expanded file generated.
ExpandObjects Finished. Time:     0.475
EnergyPlus Starting
EnergyPlus, Version 9.5.0-de239b2e5f, YMD=2022.03.30 13:28
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
Adjusting Air System Sizing
Adjusting Standard 62.1 Ventilation Sizing
Initializing Simulation
Reporting Surfaces
Beginning Primary Simulation
Initializing New Environment Parameters
Warming up {1}
Instantiating FunctionalMockupUnitExport interface
ExternalInterface initializes.
Number of outputs in ExternalInterface = 68
Number of inputs  in ExternalInterface = 13
Calculating Detailed Daylighting Factors, Start Date=01/01
Warming up {2}
Warming up {3}
Warming up {4}
Warming up {5}
Warming up {6}
Starting Simulation at 01/01/2017 for ALL_YEAR
ExternalInterface starts first data exchange.
Updating Shadowing Calculations, Start Date=01/21/2017
Updating Detailed Daylighting Factors, Start Date=01/21
Continuing Simulation at 01/21/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=02/10/2017
Updating Detailed Daylighting Factors, Start Date=02/10
Continuing Simulation at 02/10/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=03/02/2017
Updating Detailed Daylighting Factors, Start Date=03/02
Continuing Simulation at 03/02/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=03/22/2017
Updating Detailed Daylighting Factors, Start Date=03/22
Continuing Simulation at 03/22/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=04/11/2017
Updating Detailed Daylighting Factors, Start Date=04/11
Continuing Simulation at 04/11/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=05/01/2017
Updating Detailed Daylighting Factors, Start Date=05/01
Continuing Simulation at 05/01/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=05/21/2017
Updating Detailed Daylighting Factors, Start Date=05/21
Continuing Simulation at 05/21/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=06/10/2017
Updating Detailed Daylighting Factors, Start Date=06/10
Continuing Simulation at 06/10/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=06/30/2017
Updating Detailed Daylighting Factors, Start Date=06/30
Continuing Simulation at 06/30/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=07/20/2017
Updating Detailed Daylighting Factors, Start Date=07/20
Continuing Simulation at 07/20/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=08/09/2017
Updating Detailed Daylighting Factors, Start Date=08/09
Continuing Simulation at 08/09/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=08/29/2017
Updating Detailed Daylighting Factors, Start Date=08/29
Continuing Simulation at 08/29/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=09/18/2017
Updating Detailed Daylighting Factors, Start Date=09/18
Continuing Simulation at 09/18/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=10/08/2017
Updating Detailed Daylighting Factors, Start Date=10/08
Continuing Simulation at 10/08/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=10/28/2017
Updating Detailed Daylighting Factors, Start Date=10/28
Continuing Simulation at 10/28/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=11/17/2017
Updating Detailed Daylighting Factors, Start Date=11/17
Continuing Simulation at 11/17/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=12/07/2017
Updating Detailed Daylighting Factors, Start Date=12/07
Continuing Simulation at 12/07/2017 for ALL_YEAR
Updating Shadowing Calculations, Start Date=12/27/2017
Updating Detailed Daylighting Factors, Start Date=12/27
Continuing Simulation at 12/27/2017 for ALL_YEAR
Writing tabular output file results using comma format.
Writing tabular output file results using HTML format.
 ReadVarsESO program starting.
 ReadVars Run Time=00hr 02min 10.71sec
 ReadVarsESO program completed successfully.
 ReadVarsESO program starting.
 Requested ESO file=Apartments2_heavy_insulated_pump.mtr
 does not exist.  ReadVarsESO program terminated.
 ReadVarsESO program terminated.
EnergyPlus Run Time=00hr 22min 33.38sec
EnergyPlus Completed Successfully.
Reading input and weather file for preprocessor program.
The IDF version of the input file ///root/Energym_runs/1648648289_4677490//resources//Apartments2_heavy_insulated_pump.idf starts with 9
Successfully finish reading weather file.
This is the Begin Month: 1
Time (0) set is smaller than minimun allowed (1 day). Day will be set to 1.
This is the Day of the Begin Month: 1
This is the End Month: 12
This is the Day of the End Month: 31
Day of week was left blank in input file.
This is the New Day of Week:  
Running EPMacro...
ExpandObjects Started.
No expanded file generated.
ExpandObjects Finished. Time:     0.479
EnergyPlus Starting
EnergyPlus, Version 9.5.0-de239b2e5f, YMD=2022.03.30 13:51
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
Adjusting Air System Sizing
Adjusting Standard 62.1 Ventilation Sizing
Initializing Simulation
Reporting Surfaces
Beginning Primary Simulation
Initializing New Environment Parameters
Warming up {1}
Instantiating FunctionalMockupUnitExport interface
ExternalInterface initializes.
Number of outputs in ExternalInterface = 68
Number of inputs  in ExternalInterface = 13
Calculating Detailed Daylighting Factors, Start Date=01/01
Warming up {2}
Warming up {3}
Warming up {4}
Warming up {5}
Warming up {6}
Starting Simulation at 01/01/2017 for ALL_YEAR
ExternalInterface starts first data exchange.
Note: RLlib beobench integration not available.
Random agent: creating env.
[OK] fmi2Instantiate: The Resource location of FMU with instance name %s is %s.

[WARNING] fmi2Instantiate: Argument loggingOn is set to %d
. This is not supported. loggingOn will default to '0'.

[OK] The current working directory is %s

[OK] fmi2Instantiate: Path to fmuUnzipLocation %s

[OK] fmi2Instantiate: Path to fmuResourceLocation %s

[OK] Command executes to copy content of resources folder: %s

[OK] fmi2Instantiate: Path to model description file is %s.

[OK] fmi2Instantiate: The FMU modelIdentifier is %s.

[OK] fmi2Instantiate: The FMU modelGUID is %s.

[OK] fmi2Instantiate: Slave %s is instantiated.

[OK] fmi2Instantiate: Instantiation of %s succeded.

[OK] fmi2EnterInitializationMode: The sockfd is %d.

[OK] fmi2EnterInitializationMode: The port number is %d.

[OK] fmi2EnterInitializationMode: This hostname is %s.

[OK] fmi2EnterInitializationMode: TCPServer Server waiting for clients on port: %d.

[OK] fmi2EnterInitializationMode: The number of input variables is %d.

[OK] fmi2EnterInitializationMode: The number of output variables is %d.

[OK] Get input file from resource folder %s.

[OK] Searching for following pattern %s

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Found matching file %s.

[OK] done searching pattern %s

[OK] Get input file from resource folder %s.

[OK] Searching for following pattern %s

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Found matching file %s.

[OK] done searching pattern %s

[OK] Get input file from resource folder %s.

[OK] Searching for following pattern %s

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Found matching file %s.

[OK] done searching pattern %s

[OK] This version uses the **energyplus** command line interface to  call the EnergyPlus executable. **RunEPlus.bat** and **runenergyplus** , which were used in earlier versions, were deprecated as of August 2015.
[OK] fmi2EnterInitializationMode: The connection has been accepted.

[OK] fmi2EnterInitializationMode: Slave %s is initialized.

Random agent: resetting env.
[OK] fmi2Terminate: fmiFreeInstanceSlave must be called to free the FMU instance.

[OK] fmi2FreeInstance: The function fmi2FreeInstance of instance %s is executed.

[OK] freeInstanceResources: %s will be freed.

[OK] fmi2Instantiate: The Resource location of FMU with instance name %s is %s.

[WARNING] fmi2Instantiate: Argument loggingOn is set to %d
. This is not supported. loggingOn will default to '0'.

[OK] The current working directory is %s

[OK] fmi2Instantiate: Path to fmuUnzipLocation %s

[OK] fmi2Instantiate: Path to fmuResourceLocation %s

[OK] Command executes to copy content of resources folder: %s

[OK] fmi2Instantiate: Path to model description file is %s.

[OK] fmi2Instantiate: The FMU modelIdentifier is %s.

[OK] fmi2Instantiate: The FMU modelGUID is %s.

[OK] fmi2Instantiate: Slave %s is instantiated.

[OK] fmi2Instantiate: Instantiation of %s succeded.

[OK] fmi2EnterInitializationMode: The sockfd is %d.

[OK] fmi2EnterInitializationMode: The port number is %d.

[OK] fmi2EnterInitializationMode: This hostname is %s.

[OK] fmi2EnterInitializationMode: TCPServer Server waiting for clients on port: %d.

[OK] fmi2EnterInitializationMode: The number of input variables is %d.

[OK] fmi2EnterInitializationMode: The number of output variables is %d.

[OK] Get input file from resource folder %s.

[OK] Searching for following pattern %s

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Found matching file %s.

[OK] done searching pattern %s

[OK] Get input file from resource folder %s.

[OK] Searching for following pattern %s

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Found matching file %s.

[OK] done searching pattern %s

[OK] Get input file from resource folder %s.

[OK] Searching for following pattern %s

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Read directory and search for *.idf, *.epw, or *.idd file.

[OK] Found matching file %s.

[OK] done searching pattern %s

[OK] This version uses the **energyplus** command line interface to  call the EnergyPlus executable. **RunEPlus.bat** and **runenergyplus** , which were used in earlier versions, were deprecated as of August 2015.
[OK] fmi2EnterInitializationMode: The connection has been accepted.

[OK] fmi2EnterInitializationMode: Slave %s is initialized.

Random agent: taking action.
OrderedDict([('Bd_Ch_EV1Bat_sp', array([-0.5792162], dtype=float32)), ('Bd_Ch_EV2Bat_sp', array([-0.79893726], dtype=float32)), ('Bd_Pw_Bat_sp', array([-0.55746967], dtype=float32)), ('P1_T_Thermostat_sp', array([-0.83537054], dtype=float32)), ('P1_onoff_HP_sp', 1), ('P2_T_Thermostat_sp', array([-0.4884674], dtype=float32)), ('P2_onoff_HP_sp', 0), ('P3_T_Thermostat_sp', array([-0.346657], dtype=float32)), ('P3_onoff_HP_sp', 1), ('P4_T_Thermostat_sp', array([-0.31184208], dtype=float32)), ('P4_onoff_HP_sp', 0)])
Random agent: taking action.
OrderedDict([('Bd_Ch_EV1Bat_sp', array([0.78278834], dtype=float32)), ('Bd_Ch_EV2Bat_sp', array([0.2201611], dtype=float32)), ('Bd_Pw_Bat_sp', array([0.27057496], dtype=float32)), ('P1_T_Thermostat_sp', array([0.64747214], dtype=float32)), ('P1_onoff_HP_sp', 1), ('P2_T_Thermostat_sp', array([-0.22706172], dtype=float32)), ('P2_onoff_HP_sp', 1), ('P3_T_Thermostat_sp', array([0.6409295], dtype=float32)), ('P3_onoff_HP_sp', 1), ('P4_T_Thermostat_sp', array([0.51800954], dtype=float32)), ('P4_onoff_HP_sp', 0)])
Random agent: taking action.
OrderedDict([('Bd_Ch_EV1Bat_sp', array([0.7659943], dtype=float32)), ('Bd_Ch_EV2Bat_sp', array([-0.04078925], dtype=float32)), ('Bd_Pw_Bat_sp', array([0.7055465], dtype=float32)), ('P1_T_Thermostat_sp', array([-0.50344086], dtype=float32)), ('P1_onoff_HP_sp', 1), ('P2_T_Thermostat_sp', array([0.8202817], dtype=float32)), ('P2_onoff_HP_sp', 1), ('P3_T_Thermostat_sp', array([-0.8721977], dtype=float32)), ('P3_onoff_HP_sp', 0), ('P4_T_Thermostat_sp', array([-0.52730197], dtype=float32)), ('P4_onoff_HP_sp', 0)])
Random agent: taking action.
/opt/beobench/experiment_setup/energymGymEnv.py:119: DeprecationWarning: invalid escape sequence \d
  self.temps = list(filter(lambda t: match("Z\d\d_T", t), self.obs_keys))
/opt/beobench/experiment_setup/energymGymEnv.py:393: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  def compute_reward(self, observation: dict) -> np.float:
/usr/local/lib/python3.8/dist-packages/energym-0.1-py3.8.egg/energym/schedules/EVSchedule.py:60: FutureWarning: Series.dt.weekofyear and Series.dt.week have been deprecated. Please use Series.dt.isocalendar().week instead.
  self.schedule["Week"] = self.schedule["Time"].dt.week
/usr/local/lib/python3.8/dist-packages/energym-0.1-py3.8.egg/energym/schedules/EVSchedule.py:60: FutureWarning: Series.dt.weekofyear and Series.dt.week have been deprecated. Please use Series.dt.isocalendar().week instead.
  self.schedule["Week"] = self.schedule["Time"].dt.week
/usr/local/lib/python3.8/dist-packages/energym-0.1-py3.8.egg/energym/schedules/EVSchedule.py:177: FutureWarning: Passing method to DatetimeIndex.get_loc is deprecated and will raise in a future version. Use index.get_indexer([item], method=...) instead.
  indices = [self.schedule.index.get_loc(dt, method="nearest") for dt in dts]
/opt/beobench/experiment_setup/energymGymEnv.py:432: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  return np.float(reward)
OrderedDict([('Bd_Ch_EV1Bat_sp', array([0.44146132], dtype=float32)), ('Bd_Ch_EV2Bat_sp', array([-0.7559141], dtype=float32)), ('Bd_Pw_Bat_sp', array([-0.4943884], dtype=float32)), ('P1_T_Thermostat_sp', array([0.12277272], dtype=float32)), ('P1_onoff_HP_sp', 1), ('P2_T_Thermostat_sp', array([0.23925698], dtype=float32)), ('P2_onoff_HP_sp', 1), ('P3_T_Thermostat_sp', array([0.2703274], dtype=float32)), ('P3_onoff_HP_sp', 0), ('P4_T_Thermostat_sp', array([-0.6166424], dtype=float32)), ('P4_onoff_HP_sp', 0)])
Random agent: taking action.
OrderedDict([('Bd_Ch_EV1Bat_sp', array([0.37055752], dtype=float32)), ('Bd_Ch_EV2Bat_sp', array([-0.3510263], dtype=float32)), ('Bd_Pw_Bat_sp', array([-0.2277605], dtype=float32)), ('P1_T_Thermostat_sp', array([-0.86376804], dtype=float32)), ('P1_onoff_HP_sp', 1), ('P2_T_Thermostat_sp', array([-0.9350178], dtype=float32)), ('P2_onoff_HP_sp', 1), ('P3_T_Thermostat_sp', array([0.7757111], dtype=float32)), ('P3_onoff_HP_sp', 1), ('P4_T_Thermostat_sp', array([-0.08315059], dtype=float32)), ('P4_onoff_HP_sp', 1)])
Random agent: taking action.
OrderedDict([('Bd_Ch_EV1Bat_sp', array([-0.01111914], dtype=float32)), ('Bd_Ch_EV2Bat_sp', array([0.9637071], dtype=float32)), ('Bd_Pw_Bat_sp', array([-0.87789124], dtype=float32)), ('P1_T_Thermostat_sp', array([0.20853722], dtype=float32)), ('P1_onoff_HP_sp', 1), ('P2_T_Thermostat_sp', array([-0.8076891], dtype=float32)), ('P2_onoff_HP_sp', 1), ('P3_T_Thermostat_sp', array([-0.843262], dtype=float32)), ('P3_onoff_HP_sp', 0), ('P4_T_Thermostat_sp', array([-0.63166785], dtype=float32)), ('P4_onoff_HP_sp', 1)])
Random agent: taking action.
OrderedDict([('Bd_Ch_EV1Bat_sp', array([-0.7527687], dtype=float32)), ('Bd_Ch_EV2Bat_sp', array([-0.873507], dtype=float32)), ('Bd_Pw_Bat_sp', array([0.37959477], dtype=float32)), ('P1_T_Thermostat_sp', array([-0.37768596], dtype=float32)), ('P1_onoff_HP_sp', 0), ('P2_T_Thermostat_sp', array([-0.23146532], dtype=float32)), ('P2_onoff_HP_sp', 0), ('P3_T_Thermostat_sp', array([-0.99944913], dtype=float32)), ('P3_onoff_HP_sp', 1), ('P4_T_Thermostat_sp', array([0.00051912], dtype=float32)), ('P4_onoff_HP_sp', 0)])
Random agent: taking action.
OrderedDict([('Bd_Ch_EV1Bat_sp', array([-0.8320181], dtype=float32)), ('Bd_Ch_EV2Bat_sp', array([0.09077284], dtype=float32)), ('Bd_Pw_Bat_sp', array([0.17681736], dtype=float32)), ('P1_T_Thermostat_sp', array([0.9338784], dtype=float32)), ('P1_onoff_HP_sp', 0), ('P2_T_Thermostat_sp', array([0.8568551], dtype=float32)), ('P2_onoff_HP_sp', 1), ('P3_T_Thermostat_sp', array([0.93643355], dtype=float32)), ('P3_onoff_HP_sp', 0), ('P4_T_Thermostat_sp', array([0.09125433], dtype=float32)), ('P4_onoff_HP_sp', 1)])
Random agent: taking action.
OrderedDict([('Bd_Ch_EV1Bat_sp', array([-0.9112033], dtype=float32)), ('Bd_Ch_EV2Bat_sp', array([0.9594012], dtype=float32)), ('Bd_Pw_Bat_sp', array([-0.80443734], dtype=float32)), ('P1_T_Thermostat_sp', array([-0.92528975], dtype=float32)), ('P1_onoff_HP_sp', 0), ('P2_T_Thermostat_sp', array([0.8676486], dtype=float32)), ('P2_onoff_HP_sp', 1), ('P3_T_Thermostat_sp', array([-0.13093314], dtype=float32)), ('P3_onoff_HP_sp', 0), ('P4_T_Thermostat_sp', array([0.20742875], dtype=float32)), ('P4_onoff_HP_sp', 1)])
Random agent: taking action.
OrderedDict([('Bd_Ch_EV1Bat_sp', array([0.5088712], dtype=float32)), ('Bd_Ch_EV2Bat_sp', array([-0.16283542], dtype=float32)), ('Bd_Pw_Bat_sp', array([-0.20831573], dtype=float32)), ('P1_T_Thermostat_sp', array([-0.73916245], dtype=float32)), ('P1_onoff_HP_sp', 1), ('P2_T_Thermostat_sp', array([-0.64466465], dtype=float32)), ('P2_onoff_HP_sp', 0), ('P3_T_Thermostat_sp', array([0.02287476], dtype=float32)), ('P3_onoff_HP_sp', 0), ('P4_T_Thermostat_sp', array([0.4030808], dtype=float32)), ('P4_onoff_HP_sp', 1)])
Random agent: completed test.
Error: The server closed the socket while the client was reading.
Error: The server closed the socket while the client was reading.
**FATAL:Error in ExternalInterface: Check EnergyPlus *.err file.
EnergyPlus Run Time=00hr 00min 38.69sec
Program terminated: EnergyPlus Terminated--Error(s) Detected.
Note: RLlib beobench integration not available.

Potential Solution

I am not sure whether this is really a bug or whether the output has some other explanation.

rdnfn commented 2 years ago

@enjeeneer Any thoughts?

enjeeneer commented 2 years ago

@rdnfn I haven't come across this behaviour before. One line from the logs stands out: "Random agent: resetting env." I suspect RLlib automatically resets the environment upon instantiation, which is interfering with Energym. I've had problems in the past with Energym's .reset() method, which the devs haven't implemented - see here. If this is the case, can we bypass RLlib's auto-reset?

rdnfn commented 2 years ago

I think you're right, the problem is with the .reset() method. I think we do want to implement it though, as it is a core part of the OpenAI gym interface. Note that Energym does implement a .reset() method, just not in the OpenAI gym way: it doesn't return an observation. Instead it is just

def reset(self):
    """Resets the simulation."""
    self.close()
    self.kpis.reset()
    self.initialize()

Then in the Beobench integration, reset is defined as

def reset(self) -> dict:
    """
    Resets the energym simulation and returns the first observation
    Args:
        None
    Returns:
        obs (dict):
            first observation from reset environment
    """
    self.env.reset()
    obs = self.env.get_output()
    obs = self.obs_converter(obs)

    return obs

In their example notebooks, the Energym docs never actually use their own reset function; they just start with env.get_output(). Looking at the implementation of the env's __init__() method, the simulation is already initialised there. Thus, by calling reset() right after creating the environment, we are effectively initialising twice. I think the problem might come from the call to self.fmu.terminate(): this might run the simulation through to its configured end date (a full year). I can't find good documentation for terminate(), so I am uncertain what exactly it does. This is where it is implemented.

I think there are two things to take away:

rdnfn commented 2 years ago

I proposed a fix in https://github.com/rdnfn/beobench_contrib/pull/2. With this fix the first call of reset() does not trigger a re-initialisation, and thus avoids the behaviour where the entire year-long simulation without inputs is run. Note that there may still be unexpected behaviour with further reset() calls -- I added a warning that is triggered on the second reset() call about potentially unexpected behaviour.

This fix is sufficient for now but not perfect. I am closing this issue, but anybody should feel free to reopen it if they run into problems with repeated reset() calls.
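
For readers who want to see the idea in isolation, here is a minimal sketch of the behaviour described above: the first reset() reuses the simulation that was already started in __init__(), and later resets emit a warning before re-initialising. This is not the code from the linked pull request. FakeEnergymEnv and ResetGuardEnv are hypothetical names, and the stand-in only mimics the method names mentioned in this thread (initialize, close, reset, get_output).

```python
import warnings


class FakeEnergymEnv:
    """Hypothetical stand-in for an Energym environment (not the real API)."""

    def __init__(self):
        self.init_count = 0
        # As described in this thread, the simulation is started on creation.
        self.initialize()

    def initialize(self):
        self.init_count += 1

    def close(self):
        pass

    def reset(self):
        # Mirrors the close-then-initialize pattern quoted earlier.
        self.close()
        self.initialize()

    def get_output(self):
        return {"init_count": self.init_count}


class ResetGuardEnv:
    """Skips re-initialisation on the first reset() and warns on later ones."""

    def __init__(self, env):
        self.env = env
        self.reset_count = 0

    def reset(self):
        self.reset_count += 1
        if self.reset_count > 1:
            # Only repeated resets restart the underlying simulation.
            warnings.warn(
                "Repeated reset() re-initialises the simulation; "
                "behaviour may be unexpected."
            )
            self.env.reset()
        # Gym-style: always return the current (first) observation.
        return self.env.get_output()
```

In this sketch, the first reset() returns an observation from the simulation that is already running, so nothing is terminated and restarted before the agent acts.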

Edit:

Given that RLlib appears to sometimes call reset twice depending on the configuration, the Energym integration now has a gym_kwarg that allows disabling the reset functionality completely. To set it, use the following config:

env:
    config:
        gym_kwargs:
            ignore_reset: True
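
As a rough illustration of what such a flag could do, the sketch below makes reset() a no-op on the simulation and simply returns the current observation when ignore_reset is set. How the actual integration handles the flag is not shown in this thread, so treat this as an assumption. StubSimulation and IgnoreResetWrapper are hypothetical names.

```python
class StubSimulation:
    """Hypothetical stand-in for an Energym environment; not the real API."""

    def __init__(self):
        self.restarts = 0
        self.steps = 0

    def reset(self):
        self.restarts += 1
        self.steps = 0

    def step(self):
        self.steps += 1

    def get_output(self):
        return {"steps": self.steps, "restarts": self.restarts}


class IgnoreResetWrapper:
    """Gym-style reset() that can be turned into a no-op on the simulation."""

    def __init__(self, env, ignore_reset=False):
        self.env = env
        self.ignore_reset = ignore_reset

    def reset(self):
        if not self.ignore_reset:
            # Normal path: restart the underlying simulation.
            self.env.reset()
        # In both cases, return the current observation.
        return self.env.get_output()
```

One consequence of this design, under the assumptions above, is that with ignore_reset enabled an episode continues from wherever the simulation currently is rather than from its start date.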