This PR arose from the need raised in issue #426: moving WandB logging into a wrapper instead of an SB3 callback, and then integrating the additional information from that callback into the wrapper's WandB session.
However, this led to a redesign of Sinergym's logging system, making it more comprehensive, simpler, more modular, and easier to customize and extend.
As a result, the callback for logging SB3 data during training has been removed, leaving only the WandbOutputFormat logger available to apply to SB3's native logger, which integrates with the session already created by the wrapper.
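The idea of an output format that writes into an already-open session can be sketched roughly as follows. This is only an illustration under stated assumptions, not Sinergym's implementation: `FakeSession` and `SessionOutputFormat` are hypothetical stand-ins (no WandB or SB3 import is used), and the `write` method merely mirrors the shape of SB3's `KVWriter.write(key_values, key_excluded, step)`.

```python
from typing import Any, Dict, List, Tuple


class FakeSession:
    """Hypothetical stand-in for a session already opened by the wrapper."""

    def __init__(self) -> None:
        self.records: List[Tuple[int, Dict[str, Any]]] = []

    def log(self, data: Dict[str, Any], step: int) -> None:
        # Store a copy so later mutation of `data` does not change the record.
        self.records.append((step, dict(data)))


class SessionOutputFormat:
    """Hypothetical output format that reuses an existing session.

    It does not create a session of its own; it only forwards values
    to the one it was given.
    """

    def __init__(self, session: FakeSession) -> None:
        self.session = session

    def write(
        self,
        key_values: Dict[str, Any],
        key_excluded: Dict[str, Tuple[str, ...]],
        step: int = 0,
    ) -> None:
        # Forward numeric values only; booleans and strings are skipped.
        numeric = {
            key: value
            for key, value in key_values.items()
            if isinstance(value, (int, float)) and not isinstance(value, bool)
        }
        if numeric:
            self.session.log(numeric, step=step)


session = FakeSession()
writer = SessionOutputFormat(session)
writer.write({"train/loss": 0.25, "train/note": "skipped"}, {}, step=10)
```

Because the writer receives the session rather than creating one, training metrics and wrapper metrics would end up in the same place in this sketch.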
Additionally, the evaluation callback has been improved. If the environment is wrapped by the WandB logger, it will write summarized data for all evaluated episodes to the session each time the event is triggered, and it will also incorporate the best model output from the training data, unifying everything.
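A minimal sketch of what "summarized data for all evaluated episodes" could look like, assuming the summary is a per-metric mean and that the best model is chosen by mean reward. Both assumptions, and the names `summarize_episodes` and `BestModelTracker`, are hypothetical and not taken from the PR.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


def summarize_episodes(episodes: List[Dict[str, float]]) -> Dict[str, float]:
    """Average every metric across the evaluated episodes (assumed summary)."""
    if not episodes:
        raise ValueError("at least one evaluated episode is required")
    return {
        f"mean_{key}": mean(episode[key] for episode in episodes)
        for key in episodes[0]
    }


@dataclass
class BestModelTracker:
    """Hypothetical helper that remembers the best mean reward seen so far."""

    best_mean_reward: Optional[float] = None

    def update(self, summary: Dict[str, float]) -> bool:
        # Returns True when this evaluation is the best one so far.
        reward = summary["mean_reward"]
        if self.best_mean_reward is None or reward > self.best_mean_reward:
            self.best_mean_reward = reward
            return True
        return False


episodes = [
    {"reward": -10.0, "power": 200.0},
    {"reward": -20.0, "power": 400.0},
]
summary = summarize_episodes(episodes)
tracker = BestModelTracker()
is_best = tracker.update(summary)
```

In this sketch, each evaluation event would produce one summary dictionary, which could then be written to the same session together with a flag saying whether the model improved.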
Details about all the new functionality can be found in the documentation, which has been thoroughly updated, including the logging system, new wrappers, example notebooks, training, evaluations, and more.
Motivation and Context
[x] I have raised an issue to propose this change (required for new features and bug fixes)
Fixes #426
Types of changes
[ ] Bug fix (non-breaking change which fixes an issue)
[x] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to change)
Checklist:
autopep8 second level aggressive.
isort.
cd docs && make spelling && make html pass (required if documentation has been updated).
pytest tests/ -vv pass (required).
pytype -d import-error sinergym/ pass (required).
Changelog: