IBM / rl-testbed-for-energyplus

Reinforcement Learning Testbed for Power Consumption Optimization using EnergyPlus
MIT License

The stochasticity of the environment #23

Closed: vermouth1992 closed this issue 5 years ago

vermouth1992 commented 5 years ago

I fed the same sequence of actions to two episodes of the simulation using the same weather file, and the resulting states and rewards were identical. Shouldn't the simulation introduce some stochasticity, at least in the weather data? Otherwise, the trained RL agent may just memorize the observations in the order they occur in each episode and fail to generalize to a new data distribution.

antoine-galataud commented 5 years ago

@vermouth1992 you can train with more than one weather file. See https://energyplus.net/weather to download some (a few are also bundled with the EnergyPlus installation package). Making the weather more stochastic isn't as simple as adding noise to the dry-bulb temperature, for instance, because the weather variables depend on one another.
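
To illustrate the multi-file suggestion, here is a minimal sketch of drawing a different weather file for each episode. It is not the testbed's own mechanism: the helper names `list_weather_files` and `sample_weather_file`, and the `.epw` file names in the demo, are hypothetical. How the chosen path is passed to EnergyPlus depends on how the environment is launched and is not shown here.

```python
import random
from pathlib import Path


def list_weather_files(weather_dir):
    """Return the .epw files found in weather_dir, sorted so runs are repeatable.

    Hypothetical helper: the directory layout is an assumption.
    """
    return sorted(Path(weather_dir).glob("*.epw"))


def sample_weather_file(weather_files, rng):
    """Pick one weather file uniformly at random for the next episode."""
    if not weather_files:
        raise ValueError("no weather files to sample from")
    return rng.choice(weather_files)


# Demo with placeholder file names (assumed, not shipped with the testbed).
demo_files = ["city_a.epw", "city_b.epw", "city_c.epw"]
rng = random.Random(0)  # seeded so the episode schedule can be reproduced
schedule = [sample_weather_file(demo_files, rng) for _ in range(5)]
```

Each simulation run stays deterministic for a given file, but the agent no longer sees the same weather sequence in every episode. Holding out one or more files for evaluation gives a rough check of generalization.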