Add support for Stable Baselines3

As a CityLearn user, I want to take advantage of Stable Baselines3's reliable implementations of RL algorithms so that I can easily evaluate my environment on a diverse set of algorithms and benchmark their performance.

Changes can be made to the environment as long as the acceptance criteria below are met.

Acceptance Criteria

[ ] Setup works for the RL algorithms that make use of the Box space (gym.spaces.Box).
[ ] Setup works for an n-building environment when env.central_agent = True (a single agent controls all buildings).
[ ] Setup works for an n-building environment when env.central_agent = False (independent multi-agent, i.e. each building has its own agent and agents do not share information).
[ ] Setup does not disrupt the compatibility of the environment with CityLearn’s RBC, SAC, and MARLISA implementations in citylearn/agents.
[ ] The test_environment.py module runs without error.
[ ] The example.ipynb notebook runs without error.
[ ] The example.ipynb notebook provides an example implementation using at least one of the Stable Baselines3 algorithms for n buildings in both the central-agent and non-central-agent scenarios.
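
To illustrate the central-agent criterion, here is a minimal sketch of the bookkeeping a wrapper might need, assuming the environment exposes one set of Box bounds per building while a Stable Baselines3 algorithm expects a single flat space. The helper names `flatten_bounds` and `split_actions` and the example bounds are hypothetical and not part of CityLearn; plain lists stand in for the `low`/`high` arrays that would be passed to `gym.spaces.Box`.

```python
from typing import List, Sequence, Tuple

# (low, high) bounds for one building; stands in for a Box space's limits.
Bounds = Tuple[List[float], List[float]]


def flatten_bounds(per_building: Sequence[Bounds]) -> Bounds:
    """Concatenate per-building (low, high) bounds into one flat pair."""
    low: List[float] = []
    high: List[float] = []
    for b_low, b_high in per_building:
        if len(b_low) != len(b_high):
            raise ValueError("low and high must have the same length")
        low.extend(b_low)
        high.extend(b_high)
    return low, high


def split_actions(flat: Sequence[float], sizes: Sequence[int]) -> List[List[float]]:
    """Split a flat central-agent action vector into per-building actions."""
    if len(flat) != sum(sizes):
        raise ValueError("flat action length does not match building sizes")
    actions: List[List[float]] = []
    start = 0
    for size in sizes:
        actions.append(list(flat[start:start + size]))
        start += size
    return actions


# Hypothetical 2-building district: building 0 has two controllable
# devices, building 1 has one (values are illustrative only).
bounds = [([-1.0, -0.5], [1.0, 0.5]), ([-1.0], [1.0])]
low, high = flatten_bounds(bounds)            # bounds of the single flat space
sizes = [len(b_low) for b_low, _ in bounds]   # number of actions per building
per_building = split_actions([0.2, -0.1, 0.7], sizes)
```

In the non-central case (env.central_agent = False) no flattening would be needed under this assumption: each building's bounds could back its own space, with one independent agent trained per building.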

intelligent-environments-lab / CityLearn

Add support for stable baselines3 #38
