ugr-sail / sinergym

Gym environment for building simulation and control using reinforcement learning
https://ugr-sail.github.io/sinergym/
MIT License

(v3.5.0) - Sinergym WandB wrapper migration and logging system back-end improvement #427

Closed AlejandroCN7 closed 2 months ago

AlejandroCN7 commented 2 months ago

Description

This PR arose from the need described in issue #426: moving WandB logging into an environment wrapper instead of relying on an SB3 callback, and then integrating the additional information from that callback into the WandB session created by the wrapper.
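To make the wrapper-based approach concrete, here is a minimal sketch of the general pattern, not Sinergym's actual implementation. `FakeSession`, `ToyEnv`, and `SessionLoggerWrapper` are hypothetical names invented for this illustration; `FakeSession` merely stands in for a WandB run so the example runs without any external dependency.

```python
from typing import Any, Dict, List, Tuple


class FakeSession:
    """Hypothetical stand-in for a WandB run; it only records what is logged."""

    def __init__(self) -> None:
        self.records: List[Dict[str, Any]] = []

    def log(self, data: Dict[str, Any]) -> None:
        self.records.append(dict(data))


class ToyEnv:
    """Hypothetical 3-step environment with a Gymnasium-like reset/step shape."""

    def __init__(self) -> None:
        self.t = 0

    def reset(self) -> Tuple[int, Dict[str, Any]]:
        self.t = 0
        return 0, {}

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        self.t += 1
        terminated = self.t >= 3
        return self.t, -1.0 * action, terminated, False, {}


class SessionLoggerWrapper:
    """Wraps an environment and logs one summary per finished episode."""

    def __init__(self, env: ToyEnv, session: FakeSession) -> None:
        self.env = env
        self.session = session
        self.episode = 0
        self._rewards: List[float] = []

    def reset(self) -> Tuple[int, Dict[str, Any]]:
        self._rewards = []
        return self.env.reset()

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        obs, reward, terminated, truncated, info = self.env.step(action)
        self._rewards.append(reward)
        if terminated or truncated:
            # The wrapper, not a training callback, owns the logging session.
            self.episode += 1
            self.session.log({
                "episode": self.episode,
                "length": len(self._rewards),
                "cumulative_reward": sum(self._rewards),
            })
        return obs, reward, terminated, truncated, info


session = FakeSession()
env = SessionLoggerWrapper(ToyEnv(), session)
env.reset()
done = False
while not done:
    _, _, terminated, truncated, _ = env.step(2)
    done = terminated or truncated
```

Because the session belongs to the wrapper, any training library that steps the environment produces the same episode logs, which is the main appeal of a wrapper over a library-specific callback.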

In turn, this led to a redesign of Sinergym's logging system, making it more comprehensive, simpler, more modular, and easier to customize and extend.

As a result, the callback for logging SB3 data during training has been removed, leaving only the WandbOutputFormat logger available to apply to SB3's native logger, which integrates with the session already created by the wrapper.
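The idea of an output format that feeds an already-open session can be sketched as follows. This is an illustration under stated assumptions, not the `WandbOutputFormat` shipped in Sinergym: `SessionOutputFormat` and `FakeSession` are hypothetical, and the class only mirrors the `write(key_values, key_excluded, step)` / `close()` shape of SB3's `KVWriter` so that the example runs without SB3 or WandB installed.

```python
from typing import Any, Dict, List, Optional, Tuple, Union


class FakeSession:
    """Hypothetical stand-in for a session that the wrapper already created."""

    def __init__(self) -> None:
        self.records: List[Tuple[int, Dict[str, Any]]] = []

    def log(self, data: Dict[str, Any], step: int) -> None:
        self.records.append((step, dict(data)))


class SessionOutputFormat:
    """Forwards scalar logger values to an existing session.

    Assumption: in real use this would subclass SB3's KVWriter; here it only
    imitates that interface.
    """

    def __init__(self, session: FakeSession, tag: str = "session") -> None:
        self.session = session
        self.tag = tag

    def write(
        self,
        key_values: Dict[str, Any],
        key_excluded: Dict[str, Optional[Union[str, Tuple[str, ...]]]],
        step: int = 0,
    ) -> None:
        payload: Dict[str, Any] = {}
        for key, value in key_values.items():
            excluded = key_excluded.get(key)
            # Skip keys that were explicitly excluded from this output format.
            if excluded is not None and self.tag in excluded:
                continue
            # Keep only plain scalars; other value types are ignored here.
            if isinstance(value, (int, float)):
                payload[key] = value
        if payload:
            self.session.log(payload, step)

    def close(self) -> None:
        # The session is owned by the wrapper, so nothing is closed here.
        pass


session = FakeSession()
writer = SessionOutputFormat(session)
writer.write(
    {"train/loss": 0.5, "train/note": "text", "time/fps": 120},
    {"train/loss": None, "train/note": None, "time/fps": ("session",)},
    step=10,
)
```

Note that `close()` deliberately does nothing: the writer reuses a session it did not create, so ending that session is left to its owner.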

Additionally, the evaluation callback has been improved. If the environment is wrapped by the WandB logger, the callback writes summarized data for all evaluated episodes to the session each time an evaluation is triggered. It also includes the best-model output from training, so that everything ends up in the same session.
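As a rough illustration of what "summarized data for all evaluated episodes" plus best-model tracking could look like, consider the sketch below. It is not the callback's real code: `summarize_episodes` and `EvaluationTracker` are hypothetical names, and the choice of mean and population standard deviation as summary statistics is an assumption made for the example.

```python
from statistics import mean, pstdev
from typing import Dict, List


def summarize_episodes(episode_rewards: List[float]) -> Dict[str, float]:
    """Collapse per-episode rewards from one evaluation into summary statistics."""
    return {
        "mean_reward": mean(episode_rewards),
        "std_reward": pstdev(episode_rewards),
        "episodes": len(episode_rewards),
    }


class EvaluationTracker:
    """Keeps one summary per evaluation and remembers the best mean reward."""

    def __init__(self) -> None:
        self.best_mean_reward = float("-inf")
        self.history: List[Dict[str, float]] = []

    def on_evaluation(self, episode_rewards: List[float]) -> bool:
        summary = summarize_episodes(episode_rewards)
        is_best = summary["mean_reward"] > self.best_mean_reward
        if is_best:
            # A real callback would save the model at this point.
            self.best_mean_reward = summary["mean_reward"]
        summary["best_mean_reward"] = self.best_mean_reward
        self.history.append(summary)
        return is_best


tracker = EvaluationTracker()
first_is_best = tracker.on_evaluation([-10.0, -20.0])
second_is_best = tracker.on_evaluation([-30.0, -30.0])
```

Each entry in `tracker.history` is the kind of record that could be written to the logging session whenever an evaluation fires.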

Details about all the new functionality can be found in the documentation, which has been thoroughly updated, including the logging system, new wrappers, example notebooks, training, evaluations, and more.

Motivation and Context

Fixes #426

Changelog: