Open albertochiappa opened 7 months ago
Thanks for considering the feature proposal. We're looking forward to hearing your feedback!
Hello, thanks for the proposal. I still need to read the paper, but in any case I would welcome a PR for the project section of the documentation =) (combining Lattice exploration and MyoChallenge)
I have read the paper, but I still need some time to process the content (I'll probably be back with some questions). In the meantime, I would be happy to receive a PR that updates the project section.
🚀 Feature
I propose to include in Stable Baselines 3 an option to use Lattice exploration, an action noise that some colleagues and I presented in this NeurIPS paper last year. Lattice introduces noise in the policy network before the last dense layer, making the action distribution a multivariate Gaussian with a full covariance matrix. It can improve the performance of SAC and PPO in high-dimensional environments with many actuators. In particular, we have been using it successfully in the musculoskeletal simulation library MyoSuite, where we benchmarked it together with recurrent PPO and obtained good results:
We also tested it with SAC in the common PyBullet locomotion environments, where it is especially competitive in Humanoid:
It also powered our winning solution to the NeurIPS MyoChallenge 2023.
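To illustrate why noise injected before the last dense layer yields a full covariance matrix, here is a minimal NumPy sketch. It is not the code from the fork: it assumes a purely linear final layer `a = W z + b`, independent Gaussian noise on the latent `z`, and optional independent noise added directly to the action. All names (`lattice_action_covariance`, `latent_std`, `action_std`) are hypothetical and chosen only for this example.

```python
import numpy as np


def lattice_action_covariance(W, latent_std, action_std):
    """Covariance of a = W @ (z + eps_z) + b + eps_a.

    Assumes eps_z ~ N(0, diag(latent_std**2)) and
    eps_a ~ N(0, diag(action_std**2)) are independent.
    The latent noise is mixed by W, so the off-diagonal terms are
    generally non-zero (illustrative assumption, not the fork's code).
    """
    return W @ np.diag(latent_std**2) @ W.T + np.diag(action_std**2)


def sample_actions(W, b, z, latent_std, action_std, n_samples, rng):
    """Draw actions by perturbing the latent state before the last layer."""
    eps_z = rng.normal(size=(n_samples, z.shape[0])) * latent_std
    eps_a = rng.normal(size=(n_samples, b.shape[0])) * action_std
    return (z + eps_z) @ W.T + b + eps_a


rng = np.random.default_rng(0)
latent_dim, action_dim = 4, 3

W = rng.normal(size=(action_dim, latent_dim))  # weights of the last dense layer
b = np.zeros(action_dim)                       # bias of the last dense layer
z = rng.normal(size=latent_dim)                # latent state for one observation

latent_std = np.full(latent_dim, 0.5)  # std of the noise added to the latent
action_std = np.full(action_dim, 0.1)  # std of the noise added to the action

cov = lattice_action_covariance(W, latent_std, action_std)
actions = sample_actions(W, b, z, latent_std, action_std, 200_000, rng)
empirical_cov = np.cov(actions, rowvar=False)

print("analytic covariance:\n", cov)
print("empirical covariance:\n", empirical_cov)
```

With standard independent action noise only the diagonal term `np.diag(action_std**2)` would be present. The extra `W @ np.diag(latent_std**2) @ W.T` term is what correlates the actuators under the assumptions of this sketch.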
Motivation
It would be easier for SB3 users to test Lattice in their environment of choice if it were part of the library, rather than having to install a separate package or download another repository. The change does not break any of the library's current behavior, as the feature is incremental.
Pitch
I have tried my best to integrate Lattice into SB3 while modifying the codebase as little as possible. In the branch feature/lattice of this fork of SB3 I have implemented Lattice for SAC and PPO. It can be used by setting the argument "use_lattice=True" and passing additional hyperparameters in a dictionary called "lattice_kwargs". It seems to work correctly when called from the configuration files of SB3 zoo. I would invite an SB3 developer to check whether the integration I propose follows the library's guidelines and spirit. If you have no major concerns, I would be happy to prepare a pull request!
Alternatives
Alternatively, Lattice could become part of the contrib repository of SB3. However, I don't see a way to implement it this way without creating entirely new algorithms (e.g., LatticePPO, LatticeSAC, …), which is, in my opinion, excessive, given that relatively limited changes have to be implemented in the original algorithms to enable this option.
Additional context
No response
Checklist