ibpsa / project1-boptest-gym


FMU Simulation Failed: Single-Zone Commercial Hydronic #144

Open IamAniket12 opened 2 months ago

IamAniket12 commented 2 months ago

I'm running the run_vectorized.py script from the examples folder on a different test case locally. The script executes successfully for a number of iterations, but then throws a 'simulation failed' error. I tested the same script on the bestest_hydronic_pump test case, which ran without errors. Please review the code and error message provided below.

# Imports needed to run this script (BoptestGymEnvCustomReward is a
# user-defined subclass of BoptestGymEnv, defined elsewhere and not shown).
import os

import torch
import wandb
import yaml
from stable_baselines3 import PPO
from stable_baselines3.common.callbacks import EvalCallback
from stable_baselines3.common.logger import configure
from stable_baselines3.common.vec_env import SubprocVecEnv, VecMonitor
from wandb.integration.sb3 import WandbCallback

from boptestGymEnv import NormalizedObservationWrapper, SaveAndTestCallback

def generate_urls_from_yml(boptest_root_dir):
    """Method that returns as many URLs for BOPTEST-Gym environments
    as those specified at the BOPTEST `docker-compose.yml` file.
    It assumes that `generateDockerComposeYml.py` has been called first.

    Parameters
    ----------
    boptest_root_dir: str
        String with directory to BOPTEST where the `docker-compose.yml`
        file should be located.

    Returns
    -------
    urls: list
        List of URLs where BOPTEST test cases will be allocated.
    """
    docker_compose_loc = os.path.join(boptest_root_dir, "docker-compose.yml")

    # Read the docker-compose.yml file
    urls = []  # Defined up front so the function returns a list even if parsing fails
    with open(docker_compose_loc, "r") as stream:
        try:
            docker_compose_data = yaml.safe_load(stream)
            services = docker_compose_data.get("services", {})

            # Extract the host port of each service and build its URL
            for service, config in services.items():
                ports = config.get("ports", [])
                for port in ports:
                    # Extract host port (assumes mappings like "127.0.0.1:5000:5000")
                    host_port = port.split(":")[1]
                    urls.append(f"http://127.0.0.1:{host_port}")

            print(urls)  # Print URLs

        except yaml.YAMLError as exc:
            print(exc)

    return urls
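
# For reference, a hypothetical docker-compose.yml of the shape this function
# expects (as generated by generateDockerComposeYml.py; exact content may differ):
#
#   services:
#     boptest5000:
#       ports:
#         - "127.0.0.1:5000:5000"
#     boptest5001:
#       ports:
#         - "127.0.0.1:5001:5001"
#
# With a mapping of this shape, port.split(":")[1] yields the host port ("5000").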

def make_env(url):
    """Function that instantiates the environment.
    Parameters
    ----------
    url: string
        Rest API URL for communication with this environment.
    """
    def _init():
        env = BoptestGymEnvCustomReward(
            url=url,
            actions=["oveValCoi_u"],
            observations={
                "time": (0, 31536000),
                "reaTZon_y": (280.0, 310.0),
                "reaCO2Zon_y": (200.0, 2000.0),
                "PriceElectricPowerHighlyDynamic": (-0.4, 0.4),
                "LowerSetp[1]": (280.0, 310.0),
                "UpperSetp[1]": (280.0, 310.0),
                "UpperCO2[1]": (0, 10000),
            },
            scenario={"electricity_price": "highly_dynamic"},
            predictive_period=24 * 3600,
            random_start_time=True,
            excluding_periods=[
                (173*24*3600, 266*24*3600)
            ],
            max_episode_length=14*24*3600,
            step_period=3600,
            warmup_period=7*24*3600
        )
        env = NormalizedObservationWrapper(env)  # Add observation normalization if needed
        # env = DiscretizedActionWrapper(env, n_bins_act=15)  # Add action discretization if needed
        return env

    return _init

def train_PPO_vectorized(
    venv,
    log_dir=os.path.join("results", "PPO", "V1"),
    tensorboard_log=os.path.join("results", "PPO", "V1"),
):
    """Method to train a PPO agent using a vectorized environment.

    Parameters
    ----------
    venv: stable_baselines3.common.vec_env.SubprocVecEnv
        Vectorized environment to train on.
    """
    # Create the logging directory if it doesn't exist. Monitoring data and the agent model will be stored here
    os.makedirs(log_dir, exist_ok=True)
    env_config = {
        "url": "url",
        "actions": ["oveValCoi_u"],
        "observations": {
            "time": [0, 31536000],
            "reaTZon_y": [280.0, 310.0],
            "reaCO2Zon_y": [200.0, 2000.0],
            "PriceElectricPowerHighlyDynamic": [-0.4, 0.4],
            "LowerSetp[1]": [280.0, 310.0],
            "UpperSetp[1]": [280.0, 310.0],
            "UpperCO2[1]": (0, 10000),
        },
        "scenario": {"electricity_price": "highly_dynamic"},
        "predictive_period": 24 * 3600,
        "random_start_time": "true",
        "excluding_periods": [[173*24*3600, 266*24*3600]],
        "max_episode_length": 14*24*3600,
        "step_period": 3600,
        "action_space": "continuous",
        "warmup_period": 7*24*3600
    }

    # Modify the environment to include the callback
    venv = VecMonitor(venv=venv, filename=os.path.join(log_dir, "monitor.csv"))
    run = wandb.init(
        project="PPO",  # Replace with your project name
        sync_tensorboard=True,  # Auto-sync with TensorBoard
        config=env_config,
        name="V1",
        id="676",
        resume="allow",
    )
    print(run.id)

    # Create the callback: evaluate with one episode every eval_freq training steps.
    # We keep it short for testing.
    eval_freq = 2000
    eval_callback = EvalCallback(
        venv,
        best_model_save_path=log_dir,
        log_path=log_dir,
        eval_freq=int(eval_freq / venv.num_envs),
        n_eval_episodes=1,
        deterministic=True,
    )
    wandb_callback = WandbCallback(
        model_save_path=log_dir,
        model_save_freq=2000,
        verbose=2,
    )
    callback = SaveAndTestCallback(
        venv, check_freq=1500, save_freq=1500, log_dir=log_dir, test=False
    )

    # Use a CUDA device if available, since it is optimized for parallel computation
    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(device)

    # Instantiate an RL agent with PPO
    model = PPO(
        "MlpPolicy",
        venv,
        learning_rate=3e-4,
        n_steps=336,
        batch_size=64,
        n_epochs=10,
        gamma=0.99,
        gae_lambda=0.95,
        clip_range=0.2,
        ent_coef=0.01,
        vf_coef=0.5,
        max_grad_norm=0.5,
        tensorboard_log=tensorboard_log,
        verbose=1,
        device=device,  # Use the appropriate device
    )

    # Set up logger with TensorBoard logging continuation
    new_logger = configure(log_dir, ["stdout", "csv", "tensorboard"])
    model.set_logger(new_logger)

    # Main training loop
    model.learn(
        total_timesteps=10000000, callback=[eval_callback, wandb_callback, callback]
    )

if __name__ == "__main__":
    boptest_root = "/home/aniket/Desktop/Codes/HVAC_Training/project1-boptest"

    # BOPTEST root directory (could alternatively be passed as a command-line argument)
    boptest_root_dir = boptest_root

    # Use URLs obtained from docker-compose.yml
    urls = generate_urls_from_yml(boptest_root_dir=boptest_root_dir)

    # Create BOPTEST-Gym environment replicas
    envs = [make_env(url) for url in urls]

    # Create a vectorized environment using SubprocVecEnv
    venv = SubprocVecEnv(envs)

    # Train on the vectorized environment
    train_PPO_vectorized(venv)

Server log:

boptest5000_1  |     res = self.fmu.simulate(start_time=start_time,
boptest5000_1  |   File "src/pyfmi/fmi.pyx", line 7573, in pyfmi.fmi.FMUModelCS2.simulate
boptest5000_1  |   File "src/pyfmi/fmi.pyx", line 378, in pyfmi.fmi.ModelBase._exec_simulate_algorithm
boptest5000_1  |   File "src/pyfmi/fmi.pyx", line 374, in pyfmi.fmi.ModelBase._exec_simulate_algorithm
boptest5000_1  |   File "/home/user/miniconda/envs/pyfmi3/lib/python3.10/site-packages/pyfmi/fmi_algorithm_drivers.py", line 1065, in solve
boptest5000_1  |     result_handler.integration_point()
boptest5000_1  |   File "/home/user/miniconda/envs/pyfmi3/lib/python3.10/site-packages/pyfmi/common/io.py", line 2650, in integration_point
boptest5000_1  |     self.dump_data_internal.save_point()
boptest5000_1  |   File "src/pyfmi/fmi_util.pyx", line 1087, in pyfmi.fmi_util.DumpData.save_point
boptest5000_1  |   File "src/pyfmi/fmi.pyx", line 4203, in pyfmi.fmi.FMUModelBase2.get_real
boptest5000_1  |   File "src/pyfmi/fmi.pyx", line 4235, in pyfmi.fmi.FMUModelBase2.get_real
boptest5000_1  | pyfmi.fmi.FMUException: Failed to get the Real values.
boptest5000_1  | .
IamAniket12 commented 2 months ago

I attempted to run the singlezone_commercial_hydronic test case using the quick-start code against the hosted BOPTEST service, but encountered the same error.

Code:

from boptestGymEnv import (BoptestGymEnv, NormalizedActionWrapper,
                           NormalizedObservationWrapper, SaveAndTestCallback,
                           DiscretizedActionWrapper)
from stable_baselines3 import DQN

# URL for the hosted BOPTEST service
url = 'https://api.boptest.net'

# Decide the state-action space of your test case.
# (BoptestGymEnvCustomReward is a user-defined subclass of BoptestGymEnv, defined elsewhere.)
env = BoptestGymEnvCustomReward(
            url=url,
            testcase="singlezone_commercial_hydronic",
            actions=["oveValCoi_u"],
            observations={
                "time": (0, 31536000),
                "reaTZon_y": (280.0, 310.0),
                "reaCO2Zon_y": (200.0, 2000.0),
                "PriceElectricPowerHighlyDynamic": (-0.4, 0.4),
                "LowerSetp[1]": (280.0, 310.0),
                "UpperSetp[1]": (280.0, 310.0),
                "UpperCO2[1]": (0, 10000),
            },
            scenario={"electricity_price": "highly_dynamic"},
            predictive_period=24 * 3600,
            random_start_time=True,
            excluding_periods=[
                (173*24*3600, 266*24*3600)
            ],
            max_episode_length=14*24*3600,
            step_period=3600,
            warmup_period=7*24*3600
        )
# Normalize observations and discretize action space
env = NormalizedObservationWrapper(env)
env = DiscretizedActionWrapper(env, n_bins_act=20)

# Instantiate an RL agent
model = DQN('MlpPolicy', env, verbose=1, gamma=0.99,
            learning_rate=5e-4, batch_size=24, 
            buffer_size=365*24, learning_starts=24, train_freq=1)

# Main training loop
model.learn(total_timesteps=1000)

# Loop for one episode of experience (up to 14 days, per max_episode_length above)
done = False
obs, _ = env.reset()
while not done:
  action, _ = model.predict(obs, deterministic=True) 
  obs,reward,terminated,truncated,info = env.step(action)
  done = (terminated or truncated)

# Obtain KPIs for evaluation
env.get_kpis()
dhblum commented 2 months ago

Can you report the input you were overwriting and the value you overwrote at the step the simulation failed? Even better would be the sequence of values from the start of your simulation to the time the simulation stops.

Note that there are a couple of open issues in the boptest repo (https://github.com/ibpsa/project1-boptest/issues/432 and https://github.com/ibpsa/project1-boptest/issues/635) related to this test case, which could affect the simulation numerics under certain control inputs. We haven't addressed those issues yet, but they're on the list. They may or may not be related to your issue.
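
One minimal way to capture those values, sketched below assuming a Gymnasium-style environment, is a small logging wrapper; the ActionLoggerWrapper name and CSV path are hypothetical and not part of BOPTEST-Gym:

import csv

import gymnasium as gym

class ActionLoggerWrapper(gym.Wrapper):
    """Hypothetical helper: append each action to a CSV file so that the
    last rows show the values sent just before a simulation failure."""

    def __init__(self, env, log_path="actions_log.csv"):
        super().__init__(env)
        self.log_path = log_path

    def step(self, action):
        # Record the raw action before it is sent to the environment
        row = action.tolist() if hasattr(action, "tolist") else [action]
        with open(self.log_path, "a", newline="") as f:
            csv.writer(f).writerow(row)
        return self.env.step(action)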

IamAniket12 commented 2 months ago

I experimented with the variables ahu_oveFanSup_u, oveValCoi_u, and oveValRad_u. I tried controlling them together and separately, as well as in continuous and discrete modes. The simulation runs for a variable number of iterations; sometimes it works for thousands of iterations before failing. However, it eventually encounters an error in all cases. Interestingly, in the singlezone_commercial_hydronic test case, the code works fine when I attempt to control oveTSupSet_u, oveTZonSet_u, and oveCO2ZonSet_u.

Here are the last four overwritten inputs.

{'ahu_oveFanSup_u': 1.0, 'ahu_oveFanSup_activate': 1.0, 'oveValCoi_u': 1.0, 'oveValCoi_activate': 1.0, 'oveValRad_u': 0.21339374780654907, 'oveValRad_activate': 1.0}

{'ahu_oveFanSup_u': 0.0, 'ahu_oveFanSup_activate': 1.0, 'oveValCoi_u': 0.07432644069194794, 'oveValCoi_activate': 1.0, 'oveValRad_u': 1.0, 'oveValRad_activate': 1.0}

{'ahu_oveFanSup_u': 0.2755994498729706, 'ahu_oveFanSup_activate': 1.0, 'oveValCoi_u': 0.0, 'oveValCoi_activate': 1.0, 'oveValRad_u': 0.0, 'oveValRad_activate': 1.0}

{'ahu_oveFanSup_u': 0.0, 'ahu_oveFanSup_activate': 1.0, 'oveValCoi_u': 0.0, 'oveValCoi_activate': 1.0, 'oveValRad_u': 0.0, 'oveValRad_activate': 1.0}

dhblum commented 2 months ago

Thanks for those values. We'll consider this issue with the others regarding this test case.

Do you mean your code works fine for bestest_hydronic_heat_pump? That is a different system representation, with different model equations and numerics than this one, and is a bit simpler, so it might simulate more robustly under different control inputs at the moment.

IamAniket12 commented 2 months ago

Thank you for your response. Yes, I have run the provided script from the examples folder for bestest_hydronic_heat_pump on a local server, and it worked.

dhblum commented 2 months ago

Oh, I'm re-reading your comment. The code might work fine when controlling oveTSupSet_u, oveTZonSet_u, and oveCO2ZonSet_u because those are set points, which are served by embedded feedback controllers responsible for controlling the actuators (valves and fans). But overwriting ahu_oveFanSup_u, oveValCoi_u, and oveValRad_u controls the actuators themselves and may bypass safety measures present in the embedded feedback controllers. I would need to check exactly what's going on in this case. But generally, that's a risk in a real building too: you might have more controllability by directly controlling actuators, but that could bypass safeties that prevent things like equipment cycling, turning on a pump when all valves are closed, or freezing cooling coils due to low air flow.

IamAniket12 commented 2 months ago

Thanks for clarifying, it makes sense.

IamAniket12 commented 2 months ago

I tried using oveTSupSet_u, oveTZonSet_u, and oveCO2ZonSet_u, but the issue still persists. The server crashed with the same error after 400k time steps.

dhblum commented 2 months ago

How long is each time step, and what's your starting time? It's better to think in terms of simulation time, since then we can diagnose with help from time of day (maybe related to schedules), time of year (maybe related to climate), etc.
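
As a quick illustration, a sketch (with hypothetical start time and step index) of translating a failing step into a time of year and day:

from datetime import datetime, timedelta

start_time = 100 * 24 * 3600   # hypothetical episode start, seconds since Jan 1
step_period = 3600             # control step in seconds, as used in this thread
failed_step = 250              # hypothetical step index at which the failure occurred

sim_time = start_time + failed_step * step_period
when = datetime(2023, 1, 1) + timedelta(seconds=sim_time)  # the year is arbitrary
print(when.strftime("day %j of the year, %H:%M"))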

IamAniket12 commented 2 months ago

Here are the environment details I tested:

env = BoptestGymEnvCustomReward(
            url=url,
            actions=["oveCO2ZonSet_u", "oveTSupSet_u", "oveTZonSet_u"],
            observations={
                "time": (0, 31536000),
                "reaTZon_y": (280.0, 310.0),
                "reaCO2Zon_y": (200.0, 2000.0),
                "PriceElectricPowerHighlyDynamic": (-0.4, 0.4),
                "LowerSetp[1]": (280.0, 310.0),
                "UpperSetp[1]": (280.0, 310.0),
                "UpperCO2[1]": (0, 10000),
                "TDryBul": (280.0, 320.0),
            },
            scenario={"electricity_price": "highly_dynamic"},
            predictive_period=24 * 3600,
            random_start_time=True,
            excluding_periods=[(173 * 24 * 3600, 266 * 24 * 3600)],
            max_episode_length=14 * 24 * 3600,
            step_period=3600
        )
dhblum commented 2 months ago

@javiarrobas Is it true that step_period is the control step, which looks like it is set to 3600 s? If so, is it true that @IamAniket12's server crash at 400k steps occurs at about 45.6 years of simulation time? If this is true, 1) that seems like an unreasonable amount of time, and 2) we've never tried to run models that long and produce that much data. Could this be a memory or hard-disk error rather than a numerical/simulation error?
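
For reference, the arithmetic behind the 45.6-year figure, assuming a 3600 s control step:

steps = 400_000
step_period = 3600                   # seconds per control step
seconds = steps * step_period        # 1.44e9 s of simulated time
years = seconds / (365 * 24 * 3600)  # ~45.66 years
print(years)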

IamAniket12 commented 2 months ago

400k isn't a consistent threshold; sometimes it crashes even before reaching 10k steps.

javiarrobas commented 2 months ago

@dhblum you're right, step_period is the control step. I agree these are unreasonable times, but that's what RL demands for training and I've used them in the past as well. As BOPTEST-Gym is configured it does not warn users of those extremely long training times. That is intentional so that we can investigate the performance of RL in its best-case scenario where we assume that a lot of data is available. Even then it's hard to compete against other controllers like MPC. Regarding the error, I'm not sure whether it relates to a memory error or a simulation error. I think it could be both. We reset the test case at the end of every episode of experience (after a simulation of 14 days in this case). However, it's likely that there is a memory leak somewhere. This is probably not detected by the bestest_hydronic_heat_pump because it's a simpler test case. @IamAniket12 you could detect whether it relates to a memory error by using some kind of memory profiler.