IBM / rl-testbed-for-energyplus

Reinforcement Learning Testbed for Power Consumption Optimization using EnergyPlus
MIT License

How do we test the trained policy? #5

Closed: abhishm closed this issue 5 years ago

takaomoriyama commented 5 years ago

Unfortunately, the current implementation does not have the capability to save and restore trained network data. So I added a trick to the gym_energyplus/envs/energyplus_env.py file that allows changing weather data files dynamically.

First, let me explain how weather data files are specified in the ordinary case. A weather data file can be specified with the environment variable ENERGYPLUS_WEATHER. For example,

export ENERGYPLUS_WEATHER="${WEATHER_DIR}/USA_CA_San.Francisco.Intl.AP.724940_TMY3.epw"

You can also specify more than one weather file by listing them as comma-separated file names. In that case, the weather files are switched at each simulation episode, in round-robin fashion, in the order specified, as in the example below.
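A hypothetical two-file setup might look like the following (the second file name, USA_CO_Golden-NREL.724666_TMY3.epw, is just an example of a standard EnergyPlus TMY3 weather file; substitute whichever files you actually have):

```sh
# Two weather files, comma separated; episodes alternate between them
# in round-robin order: file 1, file 2, file 1, file 2, ...
export ENERGYPLUS_WEATHER="${WEATHER_DIR}/USA_CA_San.Francisco.Intl.AP.724940_TMY3.epw,${WEATHER_DIR}/USA_CO_Golden-NREL.724666_TMY3.epw"
```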

Here is the trick. Before a new simulation episode starts, the driver checks the files under the log directory ${ENERGYPLUS_LOGBASE}/openai-YYYY-MM-DD-HH-MM-SS-mmmmmm/. If any weather files (*.epw) are present there, they supersede the definition of ENERGYPLUS_WEATHER. So, once you think the network is well trained, put a copy of the weather file(s) you want to switch to under the log directory.
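For example, something like the sketch below should work. The timestamped directory name is a placeholder (substitute the actual name of your run's log directory), and the Golden, CO file is just an example of an evaluation weather file:

```sh
# Drop the evaluation weather file(s) into the run's log directory.
# Replace openai-YYYY-MM-DD-HH-MM-SS-mmmmmm with the actual directory name.
cp "${WEATHER_DIR}/USA_CO_Golden-NREL.724666_TMY3.epw" \
   "${ENERGYPLUS_LOGBASE}/openai-YYYY-MM-DD-HH-MM-SS-mmmmmm/"
```

From the next episode onward, the driver then picks up the .epw file(s) it finds there instead of those named in ENERGYPLUS_WEATHER.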

This is a very limited method, but I hope it helps.

takaomoriyama commented 5 years ago

Closing this issue since a description has been provided.