This update modularizes the reward calculation process, introducing additional terms to the reward and info dictionaries returned by the environment.
Additionally, CSVLogger names have been refined, and the new metrics are now included. Corresponding adjustments have been made to the training and evaluation logging callbacks for DRL algorithms.
In essence, the reward now distinguishes more clearly between the absolute values of energy consumption and comfort violation, their respective absolute penalties, and the weighted terms summed into the reward. This makes the reward easier to adapt and simplifies creating new rewards that inherit from it.
The reward section of the documentation has also been improved, with new diagrams.
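As an illustration of the decomposition described above, the sketch below shows what a modular linear reward of this shape might look like. This is not the actual Sinergym implementation; class names, observation keys, and default coefficients here are hypothetical, chosen only to show how each term can be computed separately and returned alongside the scalar reward so the environment can merge them into `info` and loggers can record them individually.

```python
import math


class BaseLinearReward:
    """Illustrative linear reward: R = W * P_E + (1 - W) * P_C, where
    P_E = -lambda_E * energy and P_C = -lambda_T * comfort_violation.
    All names are hypothetical, for illustration only."""

    def __init__(self, energy_weight=0.5, lambda_energy=1e-4,
                 lambda_temperature=1.0):
        self.W = energy_weight
        self.lambda_E = lambda_energy
        self.lambda_T = lambda_temperature

    def _energy_term(self, obs):
        # Absolute energy consumed this step (assumed observation key).
        return abs(obs['power'])

    def _comfort_term(self, obs):
        # Absolute temperature deviation outside the comfort range.
        low, high = obs['comfort_range']
        temp = obs['air_temperature']
        return max(low - temp, 0.0) + max(temp - high, 0.0)

    def __call__(self, obs):
        energy = self._energy_term(obs)
        comfort = self._comfort_term(obs)
        # Absolute penalties, then the weighted terms summed into the reward.
        energy_penalty = -self.lambda_E * energy
        comfort_penalty = -self.lambda_T * comfort
        reward = self.W * energy_penalty + (1 - self.W) * comfort_penalty
        # Every intermediate term is returned so it can be exposed through
        # the environment's info dict and picked up by logging callbacks.
        terms = {
            'energy_term': energy,
            'comfort_term': comfort,
            'energy_penalty': energy_penalty,
            'comfort_penalty': comfort_penalty,
            'reward': reward,
        }
        return reward, terms


# Creating a new reward is then just a matter of overriding one term:
class ExpComfortReward(BaseLinearReward):
    def _comfort_term(self, obs):
        # Penalize comfort violations exponentially instead of linearly.
        return math.exp(super()._comfort_term(obs)) - 1.0
```

Because each term is computed by its own method, a subclass such as the hypothetical `ExpComfortReward` changes one component without touching the weighting or logging logic.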
Types of changes
[ ] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to change)
Checklist:
[ ] autopep8 second level aggressive
[ ] isort
[ ] cd docs && make spelling && make html pass (required if documentation has been updated.)
[ ] pytest tests/ -vv pass. (required)
[ ] pytype -d import-error sinergym/ pass. (required)
Changelog: