DLR-RM / stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

https://stable-baselines3.readthedocs.io

MIT License

proposed fix for RunningMeanStd overflow #1954

Open spiglerg opened 1 week ago

spiglerg commented 1 week ago

Connected to Issue https://github.com/DLR-RM/stable-baselines3/issues/1953

Description

RunningMeanStd is made robust to overflows with two modifications:

the product that can overflow when `count` becomes too large is split into a product of smaller quantities;
overflow exceptions are detected; in that case the normalization counts are rescaled and the function is called again.
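
For readers who want to see the two ideas together, here is a minimal sketch, not the code from this PR. The class name `RunningMeanStdSketch`, the `_retried` flag, and the halving of the count on overflow are assumptions made for illustration. The moment update is the standard parallel-variance combination, rearranged so that `count * batch_count` is never formed.

```python
import numpy as np


class RunningMeanStdSketch:
    """Hypothetical stand-in for RunningMeanStd, not the code from this PR."""

    def __init__(self, epsilon=1e-4, shape=()):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = np.float64(epsilon)

    def update(self, batch):
        batch = np.asarray(batch, dtype=np.float64)
        self.update_from_moments(batch.mean(axis=0), batch.var(axis=0), batch.shape[0])

    def update_from_moments(self, batch_mean, batch_var, batch_count, _retried=False):
        try:
            # Make NumPy raise FloatingPointError instead of silently returning inf.
            with np.errstate(over="raise"):
                delta = batch_mean - self.mean
                tot_count = self.count + batch_count
                # Both weights lie in [0, 1]; dividing first means the large
                # product count * batch_count is never formed.
                w_old = self.count / tot_count
                w_new = batch_count / tot_count
                new_mean = self.mean + delta * w_new
                new_var = (
                    self.var * w_old
                    + batch_var * w_new
                    + np.square(delta) * w_old * w_new
                )
        except FloatingPointError:
            if _retried:
                raise  # rescaling did not help, so surface the error
            # Assumed recovery policy: halve the stored count (mean and var are
            # unchanged, older data just gets less weight) and try once more.
            self.count = self.count * 0.5
            self.update_from_moments(batch_mean, batch_var, batch_count, _retried=True)
            return
        self.mean, self.var, self.count = new_mean, new_var, tot_count
```

With this arrangement, an update on top of a very large `count` (for example `1e308`) stays finite, whereas multiplying `count * batch_count` directly would already exceed the float64 range.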

Motivation and Context

Issue https://github.com/DLR-RM/stable-baselines3/issues/1953
Closes #1953
[x] I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

[x] Bug fix (non-breaking change which fixes an issue)

Checklist

[x] I've read the CONTRIBUTION guide (required)
[ ] I have updated the tests accordingly (required for a bug fix or a new feature).
[x] I have reformatted the code using `make format` (required)
[x] I have checked the codestyle using `make check-codestyle` and `make lint` (required)
[x] I have ensured `make pytest` and `make type` both pass. (required)
[x] I have checked that the documentation builds using `make doc` (required)