Description
This PR modularizes the reward functions so that they inherit from a base reward class, making it easier to create new reward functions from that baseline. Taking advantage of this, a new reward function is added that is intended to behave like the linear reward but applies normalization instead of manual magnitude calibration.
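To illustrate the intended structure, here is a minimal Python sketch. It is not the actual implementation: the `power_demand`/`comfort_violation` keys and the `energy_weight` parameter are hypothetical, and rescaling each term by its running maximum is one plausible reading of "normalization instead of manual magnitude calibration".

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Tuple


class BaseReward(ABC):
    """Common interface that every reward function inherits from."""

    @abstractmethod
    def __call__(self, obs: Dict[str, Any]) -> Tuple[float, Dict[str, float]]:
        """Return the scalar reward and a dict of reward terms for logging."""
        raise NotImplementedError


class NormalizedLinearReward(BaseReward):
    """Like the linear reward, but each term is rescaled by its running
    maximum, so no manual magnitude-calibration constants are needed."""

    def __init__(self, energy_weight: float = 0.5):  # hypothetical parameter
        self.W = energy_weight
        # Running maxima used as normalizers (small epsilon avoids 0/0).
        self.max_energy = 1e-6
        self.max_comfort = 1e-6

    def __call__(self, obs):
        energy = obs['power_demand']         # hypothetical observation keys
        comfort = obs['comfort_violation']
        # Track the largest values seen so far ...
        self.max_energy = max(self.max_energy, energy)
        self.max_comfort = max(self.max_comfort, comfort)
        # ... and normalize both terms into [0, 1] before weighting them.
        energy_term = energy / self.max_energy
        comfort_term = comfort / self.max_comfort
        reward = -self.W * energy_term - (1.0 - self.W) * comfort_term
        return reward, {'energy_term': energy_term,
                        'comfort_term': comfort_term}
```

Under this design, a new reward function only needs to subclass `BaseReward` and override `__call__`, which is the modularization benefit the PR aims for.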
Motivation and Context
[ ] I have raised an issue to propose this change (required for new features and bug fixes)
Why is this change required? What problem does it solve? Please reference any previously opened issues.
Fixes #(issue or issues)
Types of changes
[ ] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to change)
Checklist:
[ ] autopep8 second level aggressive
[ ] isort
[ ] cd docs && make spelling && make html pass (required if documentation has been updated.)
[ ] pytest tests/ -vv pass. (required)
[ ] pytype -d import-error sinergym/ pass. (required)
Changelog:
NormalizedLinearReward.