FilippoAiraldi / mpcrl-for-ramp-metering

Highway Traffic Control with MPC-based RL
GNU General Public License v3.0

Not able to save information after simulation #4

Open prajwalvinod opened 2 months ago

Hey @FilippoAiraldi,

I have adjusted the code to run ramp metering (RM) and variable speed limits (VSL) combined, and to use DDPG and SAC for training. Unfortunately, after training completes, the data fails to save. I am attaching the adjusted code here in a zip file, along with the error I get.

files.zip

```
[Simulated FA_DDPG_C_SIM2 with 15 agents]
[HighwayTrafficEnvC(SX), HighwayTrafficEnvC(SX), HighwayTrafficEnvC(SX), HighwayTrafficEnvC(SX), HighwayTrafficEnvC(SX), HighwayTrafficEnvC(SX), HighwayTrafficEnvC(SX), HighwayTrafficEnvC(SX), HighwayTrafficEnvC(SX), HighwayTrafficEnvC(SX), HighwayTrafficEnvC(SX), HighwayTrafficEnvC(SX), HighwayTrafficEnvC(SX), HighwayTrafficEnvC(SX), HighwayTrafficEnvC(SX)]
Traceback (most recent call last):
  File "C:\Users\prajw\PythonFiles\mpcrl-for-ramp-metering-review1\mpcrl-for-ramp-metering-review1\launch.py", line 382, in <module>
    launch_training(args_)
  File "C:\Users\prajw\PythonFiles\mpcrl-for-ramp-metering-review1\mpcrl-for-ramp-metering-review1\launch.py", line 165, in launch_training
    save_data(
  File "C:\Users\prajw\PythonFiles\mpcrl-for-ramp-metering-review1\mpcrl-for-ramp-metering-review1\util\io.py", line 168, in save_data
    info["envs"] = postprocess_env_data(data)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\prajw\PythonFiles\mpcrl-for-ramp-metering-review1\mpcrl-for-ramp-metering-review1\util\io.py", line 36, in postprocess_env_data
    for k, v in datum.finalized_step_infos(np.nan).items()
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'HighwayTrafficEnvC' object has no attribute 'finalized_step_infos'
```

I added only the relevant files to the zip folder; the other files from the repository are unchanged, which is why I find this confusing. The above error occurs when I do not use the AugmentedObservationWrapper.
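In case it clarifies what I mean, here is a minimal sketch of the failure mode as I understand it: `postprocess_env_data` expects an env that exposes `finalized_step_infos`, which the raw `HighwayTrafficEnvC` does not have (presumably it is added by a monitoring wrapper). All class and function names below are stand-ins of my own, not the repository's actual code:

```python
# Stand-in for the raw environment: no info-recording wrapper applied.
class RawEnv:
    pass


# Stand-in for a wrapper that records per-step infos during simulation.
class MonitoringWrapper:
    def __init__(self, env):
        self.env = env

    def finalized_step_infos(self, fill_value):
        # Would normally pad ragged per-step info lists with `fill_value`.
        return {"tts": [fill_value]}


def postprocess(env):
    # Guard that makes the cause of the AttributeError explicit.
    if not hasattr(env, "finalized_step_infos"):
        raise AttributeError(
            f"{type(env).__name__!r} has no 'finalized_step_infos'; "
            "was the monitoring wrapper applied before training?"
        )
    return env.finalized_step_infos(float("nan"))


# postprocess(RawEnv()) raises AttributeError, as in the traceback above;
# postprocess(MonitoringWrapper(RawEnv())) succeeds.
```

So my guess is that my DDPG/SAC training path never applies the wrapper that the LSTDQ path applies, but I cannot see where that happens.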

When I do use the AugmentedObservationWrapper, training does not start at all and I get the following error:

```
Traceback (most recent call last):
  File "C:\Users\prajw\PythonFiles\mpcrl-for-ramp-metering-review1\mpcrl-for-ramp-metering-review1\launch.py", line 382, in <module>
    launch_training(args_)
  File "C:\Users\prajw\PythonFiles\mpcrl-for-ramp-metering-review1\mpcrl-for-ramp-metering-review1\launch.py", line 162, in launch_training
    data = Parallel(n_jobs=args.n_jobs)(delayed(fun)(i) for i in range(args.agents))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Anaconda3\Lib\site-packages\joblib\parallel.py", line 1085, in __call__
    if self.dispatch_one_batch(iterator):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Anaconda3\Lib\site-packages\joblib\parallel.py", line 901, in dispatch_one_batch
    self._dispatch(tasks)
  File "D:\Anaconda3\Lib\site-packages\joblib\parallel.py", line 819, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Anaconda3\Lib\site-packages\joblib\_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
             ^^^^^^^^^^^^^^^^^^^^^
  File "D:\Anaconda3\Lib\site-packages\joblib\_parallel_backends.py", line 597, in __init__
    self.results = batch()
                   ^^^^^^^
  File "D:\Anaconda3\Lib\site-packages\joblib\parallel.py", line 288, in __call__
    return [func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Anaconda3\Lib\site-packages\joblib\parallel.py", line 288, in <listcomp>
    return [func(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\prajw\PythonFiles\mpcrl-for-ramp-metering-review1\mpcrl-for-ramp-metering-review1\launch.py", line 97, in fun
    return train_ddpg_c(
           ^^^^^^^^^^^^^
  File "C:\Users\prajw\PythonFiles\mpcrl-for-ramp-metering-review1\mpcrl-for-ramp-metering-review1\other_agents\ddpg_combined.py", line 250, in train_ddpg_c
    model.learn(total_timesteps=total_timesteps, log_interval=1, callback=cb)
  File "D:\Anaconda3\Lib\site-packages\stable_baselines3\td3\td3.py", line 222, in learn
    return super().learn(
           ^^^^^^^^^^^^^^
  File "D:\Anaconda3\Lib\site-packages\stable_baselines3\common\off_policy_algorithm.py", line 314, in learn
    total_timesteps, callback = self._setup_learn(
                                ^^^^^^^^^^^^^^^^^^
  File "D:\Anaconda3\Lib\site-packages\stable_baselines3\common\off_policy_algorithm.py", line 297, in _setup_learn
    return super()._setup_learn(
           ^^^^^^^^^^^^^^^^^^^^^
  File "D:\Anaconda3\Lib\site-packages\stable_baselines3\common\base_class.py", line 423, in _setup_learn
    self._last_obs = self.env.reset()  # type: ignore[assignment]
                     ^^^^^^^^^^^^^^^^
  File "D:\Anaconda3\Lib\site-packages\stable_baselines3\common\vec_env\vec_normalize.py", line 295, in reset
    obs = self.venv.reset()
          ^^^^^^^^^^^^^^^^^
  File "D:\Anaconda3\Lib\site-packages\stable_baselines3\common\vec_env\dummy_vec_env.py", line 77, in reset
    obs, self.reset_infos[env_idx] = self.envs[env_idx].reset(seed=self._seeds[env_idx], **maybe_options)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Anaconda3\Lib\site-packages\stable_baselines3\common\monitor.py", line 83, in reset
    return self.env.reset(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Anaconda3\Lib\site-packages\gymnasium\core.py", line 467, in reset
    return self.env.reset(seed=seed, options=options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Anaconda3\Lib\site-packages\gymnasium\core.py", line 516, in reset
    return self.observation(obs), info
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\prajw\PythonFiles\mpcrl-for-ramp-metering-review1\mpcrl-for-ramp-metering-review1\other_agents\ddpg_combined.py", line 133, in observation
    assert self.observation_space.contains(new_state), "Invalid observation."
AssertionError: Invalid observation.
```
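To narrow down *why* the observation space rejects the first observation at reset (the assertion message does not say which entry is at fault), I have been using a small helper along these lines. The function name is my own, and it uses plain NumPy rather than gymnasium, so it only approximates what `Box.contains` checks (shape, dtype, NaNs, bounds):

```python
import numpy as np


def explain_box_mismatch(obs, low, high, shape, dtype=np.float64):
    """Report the first reason `obs` would fail a Box(low, high, shape) check."""
    obs = np.asarray(obs)
    if obs.shape != shape:
        return f"shape {obs.shape} != expected {shape}"
    if not np.can_cast(obs.dtype, dtype):
        return f"dtype {obs.dtype} not castable to {dtype}"
    if np.any(np.isnan(obs)):
        return f"NaN at indices {np.argwhere(np.isnan(obs)).tolist()}"
    bad = (obs < low) | (obs > high)
    if np.any(bad):
        return f"out of bounds at indices {np.argwhere(bad).tolist()}"
    return "ok"
```

Calling this with `new_state` and the wrapper's declared bounds just before the assertion should show whether the augmented observation has the wrong shape or simply falls outside the declared `low`/`high`.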

It is my understanding that some of these lines of code only work with the mpcrl package, which is why the files contain agent plots only for LSTDQ and none for DDPG. Is this correct, or is there a way to extract such plots for stable-baselines3-based RL algorithms?
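As a possible workaround, I was considering reading the `monitor.csv` file that stable-baselines3's `Monitor` wrapper writes (a `#`-prefixed JSON header line followed by CSV columns `r` = episode reward, `l` = episode length, `t` = wall time) and plotting the reward curve from that myself. A sketch of the parsing step (`load_monitor_csv` is my own helper name):

```python
import csv
import io
import json


def load_monitor_csv(text):
    """Parse Monitor-format text into (header dict, list of episode rows)."""
    lines = text.splitlines()
    # First line is a JSON metadata header prefixed with '#'.
    header = json.loads(lines[0].lstrip("#"))
    # Remaining lines are a regular CSV with columns r, l, t.
    reader = csv.DictReader(io.StringIO("\n".join(lines[1:])))
    episodes = [
        {"r": float(row["r"]), "l": int(row["l"]), "t": float(row["t"])}
        for row in reader
    ]
    return header, episodes


# The reward curve is then just [ep["r"] for ep in episodes],
# which can be fed to matplotlib.
```

Would that be a reasonable substitute for the mpcrl-based plots, or is there a recommended way in this repository?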

Looking forward to your reply.

Thanks

Regards

Prajwal