[Enhancement]: WandB migration from callbacks to wrapper

Improvement 🔧

To address compatibility issues with Sinergym environments, it is proposed to migrate all real-time training logging functionality to a native Sinergym wrapper, rather than implementing it by inheriting from the classes provided by Stable Baselines 3 (SB3).

Real-time logging of environment interaction should be as general as possible, so that it stays compatible with a broader family of algorithms. The original callback, combined with a WandB adapter, would then be used only to log algorithm-specific metrics during training.

This migration would simplify the logging functionality as a whole, since it would no longer need to adapt to the particularities and issues of SB3.

Checklist

[x] I have checked that there is no similar issue in the repo (required)
[x] I have read the documentation (required)

:pencil: Please, don't forget to include more labels besides enhancement if it is necessary.

ugr-sail / sinergym

[Enhancement]: WandB migration from callbacks to wrapper #426
