[Feature]: "Weights & Biases" integration

manjavacas commented 2 years ago

Feature 🚀

The integration of the "Weights & Biases" (wandb) library in Sinergym.

Motivation

This library may be specially interesting, as it integrates both the functionalities of TensorBoard and MLflow.

As explained by @jajimer:

It is a standalone package (no Tensorflow or Pytorch installation is needed).
It looks like it can be easily integrated with any script or framework. See this.
Integrated with SB3!

Solution

Deeply study how wandb could be integrated, even considering if it can completely replace TensorBoard and MLflow or not.

Checklist

[X] I have checked that there is no similar issue in the repo (required)

:pencil: Please, don't forget to include more labels besides Feature request if it is necessary.

kad99kev commented 2 years ago

Hi,

Firstly, thank you for your work on this library.

I have a potential solution to have wandb working with sinergym and sb3. There isn't much to be changed from the example given for MLFlow here.

I have added a minimal working example.

import sys
import gym
import wandb
import numpy as np

from typing import Any, Dict, Tuple, Union

import sinergym
from sinergym.utils.wrappers import LoggerWrapper, NormalizeObservation
from sinergym.utils.constants import RANGES_5ZONE
from sinergym.utils.callbacks import LoggerCallback

from stable_baselines3 import PPO
from stable_baselines3.common.logger import HumanOutputFormat, Logger, KVWriter

# The only addition required.
class WandBOutputFormat(KVWriter):
    """
    Dumps key/value pairs onto WandB.
    """

    def write(
        self,
        key_values: Dict[str, Any],
        key_excluded: Dict[str, Union[str, Tuple[str, ...]]],
        step: int = 0,
    ) -> None:

        for (key, value), (_, excluded) in zip(
            sorted(key_values.items()), sorted(key_excluded.items())
        ):

            if excluded is not None and "wandb" in excluded:
                continue

            if isinstance(value, np.ScalarType):
                if not isinstance(value, str):
                    wandb.log({key: value}, step=step)

# Init wandb here.
wandb.init(
        project="project_name",
        entity="entity_name",
        name="random_agent_" + wandb.util.generate_id(),
    )

env = gym.make('Eplus-5Zone-hot-continuous-v1')
env = NormalizeObservation(env, ranges=RANGES_5ZONE)
env = LoggerWrapper(env)

loggers = Logger(
    folder=None,
    output_formats=[HumanOutputFormat(sys.stdout, max_length=90), WandBOutputFormat()],
)

model = PPO("MlpPolicy", env, verbose=2)
# Set custom logger
model.set_logger(loggers)
model.learn(total_timesteps=10_000, log_interval=1, callback=LoggerCallback(sinergym_logger=True))

Please do let me know if this helps with this issue. Let me know if I should make a PR or if I could help in another way. Thanks!

AlejandroCN7 commented 2 years ago

Hi @kad99kev ! I've been testing the code you gave us and it works! The point is that our LoggerCallback doesn't work for wandb, I don't know if you got that error:

[2022-11-16 09:22:44,069] EPLUS_ENV_5Zone-hot-continuous-v1_MainThread_ROOT INFO:Updating idf ExternalInterface object if it is not present...
[2022-11-16 09:22:44,069] EPLUS_ENV_5Zone-hot-continuous-v1_MainThread_ROOT INFO:Updating idf Site:Location and SizingPeriod:DesignDay(s) to weather and ddy file...
[2022-11-16 09:22:44,071] EPLUS_ENV_5Zone-hot-continuous-v1_MainThread_ROOT INFO:Updating idf OutPut:Variable and variables XML tree model for BVCTB connection.
[2022-11-16 09:22:44,072] EPLUS_ENV_5Zone-hot-continuous-v1_MainThread_ROOT INFO:Setting up extra configuration in building model if exists...
[2022-11-16 09:22:44,072] EPLUS_ENV_5Zone-hot-continuous-v1_MainThread_ROOT INFO:Setting up action definition in building model if exists...
/usr/local/lib/python3.10/dist-packages/gym/spaces/box.py:73: UserWarning: WARN: Box bound precision lowered by casting to float32
  logger.warn(
Using cpu device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.
[2022-11-16 09:22:44,097] EPLUS_ENV_5Zone-hot-continuous-v1_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2022-11-16 09:22:44,106] EPLUS_ENV_5Zone-hot-continuous-v1_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/scripts/Eplus-env-5Zone-hot-continuous-v1-res1/Eplus-env-sub_run1
Traceback (most recent call last):
  File "/workspaces/sinergym/prueba.py", line 66, in <module>
    model.learn(
  File "/usr/local/lib/python3.10/dist-packages/stable_baselines3/ppo/ppo.py", line 310, in learn
    return super().learn(
  File "/usr/local/lib/python3.10/dist-packages/stable_baselines3/common/on_policy_algorithm.py", line 265, in learn
    self.logger.dump(step=self.num_timesteps)
  File "/usr/local/lib/python3.10/dist-packages/stable_baselines3/common/logger.py", line 486, in dump
    _format.write(self.name_to_value, self.name_to_excluded, step)
  File "/usr/local/lib/python3.10/dist-packages/stable_baselines3/common/logger.py", line 183, in write
    raise ValueError(
ValueError: Key '   Cooling_Setpoint_RL' truncated to '   Cooling_Setpoint_RL' that already exists. Consider increasing `max_length`.

Then, I removed the callback and we can only register this params currently: Captura desde 2022-11-16 09-49-47

Do you know why our Loggercallback is not compatible with wandb? Or did the code you put in work for you? If you know how to solve it, we can see how to integrate it in a PR made by you :)

kad99kev commented 2 years ago

Hi @AlejandroCN7,

I got that error as well. It is because when creating the keys, large names are truncated which often results in duplicate keys. Hence, that error. To solve that I had changed it to max_length=90.

I believe adding this would help solve the error:

loggers = Logger(
    folder=None,
    output_formats=[HumanOutputFormat(sys.stdout, max_length=90), WandBOutputFormat()],
)

Let me know if you already added this. I would be happy to contribute with a PR, let me know what API structure you have in mind and I can work on it. Thanks!

Here is the link to my WandB project - https://wandb.ai/kad99kev/sinergym-tests?workspace=user-kad99kev

AlejandroCN7 commented 2 years ago

Hi @kad99kev

Yes, I had the max_length added as you mentioned, but I still get the same error. I've tried to increase that value more, but I still get the same result.

In any case, looking at your project, I see that it works! It's great because I see that it doesn't need any big changes to make it work, thank you very much!

Let's see if I can make it work for me too and plain how we can integrate it into the tool (maybe substitute tensorboard+mlflow by wandb in DRL_battery.py), once again thank you very much for everything! :)

kad99kev commented 2 years ago

Hi @AlejandroCN7,

That is indeed weird. I removed the HumanOutputFormat completely and it still worked. I only got that error the first time I ran it. I will look into that sometime. I will see if I can get the wandb example to work with DRL_battery.py and then get back to you.

Thank you!

kad99kev commented 1 year ago

Hi @AlejandroCN7,

Sorry for the delay in getting back with this. I have some code ready, I just need to run a few more experiments with different algorithms to make sure it is working correctly. When it is done, should I make a PR or would you like me to share the code another way?

Link to WandB project - https://wandb.ai/kad99kev/sinergym_tests?workspace=user-kad99kev

Thanks!