DLR-RM / rl-baselines3-zoo

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
https://rl-baselines3-zoo.readthedocs.io
MIT License

[Bug] Problematic RL Baselines Zoo optimization information is contained in saved policies, breaking loading #196

Closed: jkterry1 closed this issue 2 years ago

jkterry1 commented 2 years ago

🐛 Bug

In this PR https://github.com/DLR-RM/rl-baselines3-zoo/pull/128 several months ago, functionality was added to log optimal policies during hyperparameter optimization. I've been using this feature for the past several months to debug odd behaviors in various custom environments (using distributed optimization), and recently I started pulling saved policies off the servers for further local analysis of physics bugs.

When I tried to load these policies locally, I got a series of errors: first that I needed to install Optuna, then mysqlclient, and finally that it couldn't connect to the IP address of the server hosting the DB that Optuna used for distributed optimization. Suspecting that something had been included in the pickled policies that shouldn't have been, I uncompressed a best_model.zip file and found this at the bottom of the data file:

"study": "<optuna.study.study.Study object at 0x7fbab55d3d90>",
"_trial_id": 101,
"_study_id": 1,
"storage": "<optuna.storages._cached_storage._CachedStorage object at 0x7fbab519abd0>",
"relative_search_space": {},
"relative_params": {},
"n_actions": null,
"using_her_replay_buffer": false

So Optuna information that shouldn't be saved really is being saved. I'm happy to submit a PR to fix this, but after spending the past 30 minutes looking through code I'm not entirely clear how this could be done, so advice would be appreciated.
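For anyone who wants to repeat the inspection described above without unzipping by hand, here is a minimal sketch using only the Python standard library. It assumes the archive contains a JSON member named `data` (the "data file" mentioned above). The helper name `find_suspect_entries` and the default search string are hypothetical, chosen only for illustration.

```python
import json
import zipfile


def find_suspect_entries(zip_path, needle="optuna"):
    """Return the top-level keys of the archive's `data` JSON whose value mentions `needle`."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the saved model keeps its JSON metadata in a member called "data".
        data = json.loads(archive.read("data").decode("utf-8"))

    suspects = []
    for key, value in data.items():
        # Dump the value back to text so that nested strings are searched as well.
        if needle in json.dumps(value):
            suspects.append(key)
    return suspects


# Example call (the path is a placeholder):
# print(find_suspect_entries("best_model.zip"))
```

Any key this returns is a candidate for the entry that drags Optuna and its storage backend into the load path.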

System Info

I know I'm using a slightly older version of pytorch/SB3, but upgrading would take a lot of work on the cluster, and after checking, there have been no relevant changes in this area.

araffin commented 2 years ago

Hello, as a hotfix, two things you can do:

I uncompressed a best_model.zip file, and at the bottom of the data file, this was contained:

We should solve the root cause, which probably comes from this piece of code:

https://github.com/DLR-RM/rl-baselines3-zoo/blob/c40cea6430d7482530bc94c395a3d633285e919c/utils/exp_manager.py#L630

which is probably not needed anymore...
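Until the root cause is removed, one possible workaround for archives that were already saved is to rewrite them without the offending entry. The following is a rough sketch, not a tested fix for the Zoo. It assumes the Optuna-related fields sit under a single top-level key of the `data` JSON. The key name `"trial"` in the usage comment is a guess and should be checked against your own file, and `strip_keys` is a hypothetical helper.

```python
import json
import zipfile


def strip_keys(src_path, dst_path, keys_to_drop):
    """Copy a model archive to dst_path, dropping the given top-level keys from its `data` JSON.

    dst_path must differ from src_path. Returns the keys that were actually removed.
    """
    removed = []
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            payload = src.read(info.filename)
            if info.filename == "data":
                # Assumption: "data" is the JSON member that holds the saved attributes.
                data = json.loads(payload.decode("utf-8"))
                for key in keys_to_drop:
                    if key in data:
                        del data[key]
                        removed.append(key)
                payload = json.dumps(data, indent=4).encode("utf-8")
            # Every other member (weights, version info, ...) is copied unchanged.
            dst.writestr(info.filename, payload)
    return removed


# Example call (paths and key name are placeholders):
# strip_keys("best_model.zip", "best_model_clean.zip", ["trial"])
```

SB3's `load()` also accepts a `custom_objects` dictionary whose keys are replaced rather than deserialized, which may make editing the archive unnecessary; whether that is sufficient in this case is untested here.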

PS: I'm moving that issue to the RL Zoo