Failed to save model after training with Basic_Run.py

I'm running Ubuntu 22.04 LTS and have everything set up to train my policy. All 20 iterations run fine, but when training ends I get an error saying that the model failed to be saved via pickle. (P.S. The server config is set up as recommended, and I use the recommended stable-baselines3: git clone https://github.com/m-abr/Adaptive-Symmetry-Learning stable-baselines3)

The error is shown below:

Traceback (most recent call last):
  File "/workspaces/FCPCodebase/Run_Utils.py", line 93, in <module>
    main()
  File "/workspaces/FCPCodebase/Run_Utils.py", line 81, in main
    mod.Train(script).train(dict())
  File "/workspaces/FCPCodebase/scripts/gyms/Basic_Run.py", line 232, in train
    model_path = self.learn_model( model, total_steps, model_path, eval_env=eval_env, eval_freq=n_steps_per_env*20, save_freq=n_steps_per_env*200, backup_env_file=__file__ )
  File "/workspaces/FCPCodebase/scripts/commons/Train_Base.py", line 278, in learn_model
    model.learn( total_timesteps=total_steps, callback=callbacks )
  File "/workspaces/stable-baselines3/stable_baselines3/ppo/ppo.py", line 345, in learn
    return super().learn(
  File "/workspaces/stable-baselines3/stable_baselines3/common/on_policy_algorithm.py", line 248, in learn
    continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, n_rollout_steps=self.n_steps)
  File "/workspaces/stable-baselines3/stable_baselines3/common/on_policy_algorithm.py", line 182, in collect_rollouts
    if callback.on_step() is False:
  File "/workspaces/stable-baselines3/stable_baselines3/common/callbacks.py", line 89, in on_step
    return self._on_step()
  File "/workspaces/stable-baselines3/stable_baselines3/common/callbacks.py", line 193, in _on_step
    continue_training = callback.on_step() and continue_training
  File "/workspaces/stable-baselines3/stable_baselines3/common/callbacks.py", line 89, in on_step
    return self._on_step()
  File "/workspaces/stable-baselines3/stable_baselines3/common/callbacks.py", line 491, in _on_step
    self.model.save(os.path.join(self.best_model_save_path, "best_model"))
  File "/workspaces/stable-baselines3/stable_baselines3/common/base_class.py", line 837, in save
    save_to_zip_file(path, data=data, params=params_to_save, pytorch_variables=pytorch_variables)
  File "/workspaces/stable-baselines3/stable_baselines3/common/save_util.py", line 309, in save_to_zip_file
    serialized_data = data_to_json(data)
  File "/workspaces/stable-baselines3/stable_baselines3/common/save_util.py", line 99, in data_to_json
    base64_encoded = base64.b64encode(cloudpickle.dumps(data_item)).decode()
  File "/workspaces/FCPCodebase/fcp/lib/python3.10/site-packages/cloudpickle/cloudpickle.py", line 1479, in dumps
    cp.dump(obj)
  File "/workspaces/FCPCodebase/fcp/lib/python3.10/site-packages/cloudpickle/cloudpickle.py", line 1245, in dump
    return super().dump(obj)
  File "/usr/lib/python3.10/multiprocessing/process.py", line 353, in __reduce__
    raise TypeError(
TypeError: Pickling an AuthenticationString object is disallowed for security reasons

Hi Allen,

The issue is that the modified stable-baselines3 version at https://github.com/m-abr/Adaptive-Symmetry-Learning lacks support for multi-process environments. To provide some context, the problem arises while the model is being saved. The saved model includes a 'sym' object, which in turn contains a nested 'sym.ppo.env.processes' object holding process objects from the 'multiprocessing' module. Each of these holds an authentication key that cannot be pickled for security reasons.
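The failure can be reproduced outside the codebase. The sketch below is only an illustration: `fake_model_data` is a hypothetical stand-in for the nesting described above, and `pickling_error` is a helper made up for this example. It uses the standard `pickle` module rather than cloudpickle, on the assumption that both reach the same `__reduce__` method seen at the bottom of the traceback.

```python
import multiprocessing as mp
import pickle


def pickling_error(obj):
    """Return the TypeError message raised while pickling obj, or None."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


# Hypothetical stand-in for the nesting model -> sym -> ppo -> env -> processes.
# Only the authentication key at the bottom matters for the error.
fake_model_data = {
    "sym": {"ppo": {"env": {"authkey": mp.current_process().authkey}}}
}

# Pickling walks into the nested structure and fails on the key.
print(pickling_error(fake_model_data))

# Without the key, the same structure pickles without a TypeError.
print(pickling_error({"sym": None}))
```

Any object that keeps a reference to such a key, however deeply nested, makes the whole save fail, which is why both solutions work by keeping 'sym' out of the saved data.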

There are 2 solutions:

A) Exclude "sym" from saved parameters

In the cloned stable-baselines3 folder, go to /stable-baselines3/stable_baselines3/common/base_class.py and add the string "sym" to the list between lines 315 and 326, as shown below:

        return [
            "policy",
            "device",
            "env",
            "eval_env",
            "replay_buffer",
            "rollout_buffer",
            "_vec_normalize_env",
            "_episode_storage",
            "_logger",
            "_custom_logger",
            "sym",
        ]

Explanation: The symmetry-related parameters are not saved with the model, which avoids triggering the pickle error. These parameters only need to be saved if the user wants to retrain a model and the robot/environment is not perfectly symmetric (which implies that the symmetry mappings are adapted during the learning process). Since the NAO robot in SimSpark is perfectly symmetric, excluding the 'sym' parameter from saved models has no negative effect.
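To illustrate how an exclusion list of this kind takes effect, here is a minimal sketch. `ToyModel` and `data_to_save` are hypothetical and are not the stable-baselines3 code; the sketch assumes that saving copies the model's attributes and drops every name in the exclusion list before anything is pickled.

```python
class ToyModel:
    """Hypothetical stand-in for an algorithm class; not the real library code."""

    def __init__(self):
        self.learning_rate = 3e-4
        self.n_steps = 128
        self.env = object()  # placeholder for an object that must not be saved
        self.sym = object()  # placeholder for the symmetry object

    def _excluded_save_params(self):
        # Names listed here never reach the serializer.
        return ["env", "sym"]

    def data_to_save(self):
        # Copy the attributes, then drop every excluded name.
        data = self.__dict__.copy()
        for name in self._excluded_save_params():
            data.pop(name, None)
        return data


saved = ToyModel().data_to_save()
print(sorted(saved))
```

Because 'sym' is dropped before serialization, the nested process objects are never visited by the pickler.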

B) Delete the modified stable-baselines3 and install the most recent version:

pip3 install stable-baselines3 gym shimmy

Explanation: Using the latest stable-baselines3 is probably better in terms of performance/stability. It's important to note that using the modified stable-baselines3 from https://github.com/m-abr/Adaptive-Symmetry-Learning has no advantage unless the user defines the symmetry mappings for the NAO robot.
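After switching, it may be worth checking which copy of the package Python actually picks up, since a leftover clone could still shadow the pip-installed one. The helper below is a sketch; `module_location` is a name made up for this example, and it only reports a path without importing the package.

```python
import importlib.util


def module_location(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# A path inside site-packages suggests the pip-installed version is in use;
# a path inside the cloned folder suggests the modified copy is still active.
print(module_location("stable_baselines3"))
```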

As mentioned in the documentation, using the modified stable-baselines3 is an advanced option. I recommend solution B for anyone who is just getting started. I am currently updating the documentation to make this clearer.
