ugr-sail / sinergym

Gym environment for building simulation and control using reinforcement learning
https://ugr-sail.github.io/sinergym/
MIT License

[Bug]: Difference in reward calculation? #352

Closed kad99kev closed 1 year ago

kad99kev commented 1 year ago

Hello!

Thank you for the constant updates with Sinergym!

I recently updated my Sinergym version from 2.3.4 to 2.5.2 and I have been getting different results for the same model.

To reproduce this in a simple way, I created the following script.

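import sinergym  # importing registers the Eplus-* environments
import gymnasium as gym
import random

# Seeds Python's random module only; the environment is seeded in reset() below.
random.seed(0)

env = gym.make('Eplus-5Zone-hot-continuous-v1')
obs, info = env.reset(seed=0)
# Normalized action; it shows up as setpoints [15.0, 30.0] in the info output.
a = [-1, 1]
obs, reward, terminated, truncated, info = env.step(a)
print(reward)
print(info)
env.close()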

These are the outputs I get for the following versions:

2.3.4

{'timestep': 1, 'time_elapsed': 900, 'year': 1991, 'month': 1, 'day': 1, 'hour': 0, 'action': [15.0, 30.0], 'reward': -0.2074863442240327, 'reward_energy': -0.059453113841696555, 'reward_comfort': -0.35551957460636885, 'total_energy': 594.5311384169655, 'abs_comfort': 0.35551957460636885, 'temperatures': [19.64448042539363]}

2.5.2

{'timestep': 1, 'time_elapsed': 900, 'year': 1991, 'month': 1, 'day': 1, 'hour': 0, 'action': [15.0, 30.0], 'reward': -0.23150882355604996, 'reward_energy': -0.06186536390380958, 'reward_comfort': -0.40115228320829033, 'total_energy': 618.6536390380958, 'abs_comfort': 0.40115228320829033, 'temperatures': [19.59884771679171]}
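To see which fields actually moved between the two versions, the two outputs above can be compared field by field. This is a minimal sketch: the helper name `diff_info` is made up for illustration, and the values are copied from the two outputs above.

```python
# Values copied verbatim from the two info outputs above.
info_234 = {
    'timestep': 1, 'time_elapsed': 900, 'year': 1991, 'month': 1, 'day': 1, 'hour': 0,
    'action': [15.0, 30.0],
    'reward': -0.2074863442240327,
    'reward_energy': -0.059453113841696555,
    'reward_comfort': -0.35551957460636885,
    'total_energy': 594.5311384169655,
    'abs_comfort': 0.35551957460636885,
    'temperatures': [19.64448042539363],
}
info_252 = {
    'timestep': 1, 'time_elapsed': 900, 'year': 1991, 'month': 1, 'day': 1, 'hour': 0,
    'action': [15.0, 30.0],
    'reward': -0.23150882355604996,
    'reward_energy': -0.06186536390380958,
    'reward_comfort': -0.40115228320829033,
    'total_energy': 618.6536390380958,
    'abs_comfort': 0.40115228320829033,
    'temperatures': [19.59884771679171],
}


def diff_info(old, new):
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    return {k: (old[k], new.get(k)) for k in old if old[k] != new.get(k)}


changed = diff_info(info_234, info_252)
for key, (before, after) in changed.items():
    print(f"{key}: {before} -> {after}")

# Relative change in the simulated energy between the two versions.
energy_rel_change = (
    info_252['total_energy'] - info_234['total_energy']
) / info_234['total_energy']
print(f"total_energy relative change: {energy_rel_change:.2%}")
```

The time fields and the applied action are identical in both outputs. Only the simulated quantities (energy and temperature) and the reward terms derived from them differ.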

Do I need to make any changes in the environment setup when I am updating the Sinergym version? I do not know why I am getting different rewards for the same environment and seed when I use different versions.

Any help would be greatly appreciated. Thank you!

AlejandroCN7 commented 1 year ago

Hi @kad99kev!

Thank you very much for reporting this. The version update you made brings two big changes: 1) working with JSON files instead of IDF files, and 2) updating the EnergyPlus simulation engine to its latest available version (v23.1.0). Even with the same seed, these changes make the sequence of events in the simulation slightly different, as you noticed. This is not a bug. If you need exact reproducibility of the data, I recommend always using the same Sinergym version.

Regards.

kad99kev commented 1 year ago

Thank you so much for the explanation @AlejandroCN7! So the difference in simulation is related more to EnergyPlus than to Sinergym. That makes sense now.
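As a sanity check on the reward calculation itself, the reported reward can be recomputed from the other fields of each output. The sketch below assumes a weighted sum with an energy weight of 0.5, an energy scaling of 1e-4, and a lower comfort bound of 20 °C. These constants are inferred from the numbers posted in this thread, not taken from Sinergym's source, and the function name `recompute_reward` is hypothetical.

```python
import math

# Relevant fields copied from the two info outputs posted in this thread.
info_234 = {
    'reward': -0.2074863442240327,
    'reward_energy': -0.059453113841696555,
    'reward_comfort': -0.35551957460636885,
    'total_energy': 594.5311384169655,
    'temperatures': [19.64448042539363],
}
info_252 = {
    'reward': -0.23150882355604996,
    'reward_energy': -0.06186536390380958,
    'reward_comfort': -0.40115228320829033,
    'total_energy': 618.6536390380958,
    'temperatures': [19.59884771679171],
}


def recompute_reward(info, energy_weight=0.5, lambda_energy=1e-4, comfort_low=20.0):
    """Rebuild the reward terms from raw fields.

    All three default constants are assumptions inferred from the posted
    outputs. Only a lower comfort bound is modelled, because both posted
    temperatures are below 20 degrees C.
    """
    energy_term = -lambda_energy * info['total_energy']
    comfort_term = -max(0.0, comfort_low - info['temperatures'][0])
    reward = energy_weight * energy_term + (1.0 - energy_weight) * comfort_term
    return energy_term, comfort_term, reward


for name, info in (('2.3.4', info_234), ('2.5.2', info_252)):
    energy_term, comfort_term, reward = recompute_reward(info)
    matches = (
        math.isclose(energy_term, info['reward_energy'], rel_tol=1e-9)
        and math.isclose(comfort_term, info['reward_comfort'], rel_tol=1e-9)
        and math.isclose(reward, info['reward'], rel_tol=1e-9)
    )
    print(f"{name}: recomputed reward = {reward}, matches reported values: {matches}")
```

Under these assumed constants, both outputs satisfy the same relation. That is consistent with the explanation above: the simulated energy and temperature changed between versions, not the way the reward terms are combined.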