DLR-RM / stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
https://stable-baselines3.readthedocs.io
MIT License
8.84k stars, 1.68k forks

Remove unnecessary SDE resampling in PPO update #1933

Closed: brn-dev closed this 3 months ago

brn-dev commented 4 months ago

Description

Remove policy.reset_noise() call in PPO update

Motivation and Context

Resampling the SDE noise in the PPO update is unnecessary. For more information, see https://github.com/DLR-RM/stable-baselines3/issues/1929

closes #1929
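
The following is a minimal sketch of why the resampling can be dropped, not the library's actual code. It assumes that the log-probability of an already-collected action is computed from the Gaussian obtained by marginalising over the exploration noise, so the currently sampled noise never enters the computation. `ToyGSDE`, its methods, and the one-dimensional action are hypothetical names and simplifications introduced only for illustration.

```python
import math
import random


class ToyGSDE:
    """Hypothetical toy stand-in for a state-dependent noise distribution (1-D action)."""

    def __init__(self, latent_dim, log_std=-0.5, seed=0):
        self.rng = random.Random(seed)
        self.std = [math.exp(log_std)] * latent_dim
        self.reset_noise()

    def reset_noise(self):
        # Draw a fresh exploration vector theta_i ~ N(0, std_i^2).
        self.theta = [self.rng.gauss(0.0, s) for s in self.std]

    def sample(self, mean, latent):
        # Acting: the noise is a deterministic function of the state features.
        return mean + sum(z * t for z, t in zip(latent, self.theta))

    def log_prob(self, action, mean, latent):
        # Evaluating: the marginal over theta is Gaussian with variance
        # sum_i (latent_i * std_i)^2. Note that self.theta is not used here.
        var = sum((z * s) ** 2 for z, s in zip(latent, self.std))
        return -0.5 * ((action - mean) ** 2 / var + math.log(2 * math.pi * var))


dist = ToyGSDE(latent_dim=3)
mean, latent = 0.2, [0.5, -1.0, 2.0]
action = dist.sample(mean, latent)  # action collected during the rollout

before = dist.log_prob(action, mean, latent)
dist.reset_noise()  # analogous to the call this PR removes from the update
after = dist.log_prob(action, mean, latent)

print(before == after)  # prints True: the log-probability is unchanged
```

In this toy model, resampling only changes what `sample` would return next. It consumes random numbers without affecting the quantities the update step needs.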

Types of changes

Checklist

Note: You can run most of the checks using make commit-checks.

Note: we are using a maximum length of 127 characters per line

araffin commented 3 months ago

I've created a report to check that this change has no impact on performance. Individual results change because the state of the pseudo-random generator is no longer the same, but performance should not be affected: https://wandb.ai/openrlbenchmark/sb3/reports/PR-1933-Remove-gSDE-resampling--Vmlldzo4NDk4Nzgx