Description
This PR modularizes the reward functions so that they inherit from a base reward class, making it easier to create new reward functions from that baseline. Taking advantage of this, a new reward function is added that is intended to behave like the linear reward but applies normalization instead of manual magnitude calibration.
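To illustrate the intended structure, here is a minimal Python sketch. It is not the actual implementation: the `power_demand`/`comfort_violation` keys and the `energy_weight` parameter are hypothetical, and rescaling each term by its running maximum is one plausible reading of "normalization instead of manual magnitude calibration".

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Tuple


class BaseReward(ABC):
    """Common interface that every reward function inherits from."""

    @abstractmethod
    def __call__(self, obs: Dict[str, Any]) -> Tuple[float, Dict[str, float]]:
        """Return the scalar reward and a dict of reward terms for logging."""
        raise NotImplementedError


class NormalizedLinearReward(BaseReward):
    """Like the linear reward, but each term is rescaled by its running
    maximum, so no manual magnitude-calibration constants are needed."""

    def __init__(self, energy_weight: float = 0.5):  # hypothetical parameter
        self.W = energy_weight
        # Running maxima used as normalizers (small epsilon avoids 0/0).
        self.max_energy = 1e-6
        self.max_comfort = 1e-6

    def __call__(self, obs):
        energy = obs['power_demand']         # hypothetical observation keys
        comfort = obs['comfort_violation']
        # Track the largest values seen so far ...
        self.max_energy = max(self.max_energy, energy)
        self.max_comfort = max(self.max_comfort, comfort)
        # ... and normalize both terms into [0, 1] before weighting them.
        energy_term = energy / self.max_energy
        comfort_term = comfort / self.max_comfort
        reward = -self.W * energy_term - (1.0 - self.W) * comfort_term
        return reward, {'energy_term': energy_term,
                        'comfort_term': comfort_term}
```

Under this design, a new reward function only needs to subclass `BaseReward` and override `__call__`, which is the modularization benefit the PR aims for.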
Motivation and Context
[ ] I have raised an issue to propose this change (required for new features and bug fixes)
Why is this change required? What problem does it solve? Please reference any previously opened issues.
Fixes #(issue or issues)
Types of changes
[ ] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to change)
Checklist:
[ ] autopep8 second level aggressive
[ ] isort
[ ] cd docs && make spelling && make html pass (required if documentation has been updated.)
[ ] pytest tests/ -vv pass. (required)
[ ] pytype -d import-error sinergym/ pass. (required)
Changelog:
NormalizedLinearReward.