hill-a / stable-baselines

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
http://stable-baselines.readthedocs.io/
MIT License
4.16k stars 725 forks source link

PPO2 assertion that should probably be a warning #1094

Closed jkterry1 closed 3 years ago

jkterry1 commented 3 years ago

Describe the bug PPO2 includes the following code snippet:

assert self.n_batch % self.nminibatches == 0, ("The number of minibatches (`nminibatches`) "
                                               "is not a factor of the total number of samples "
                                               "collected per rollout (`n_batch`), "
                                               "some samples won't be used."
                                               )

Having this be an assert statement causes code to be unable to run, even though it likely could run mostly fine. This means it should probably be a warning. The error messages also seems to imply this. I can do a little PR if the maintainers agree with this.

System Info Not applicable.

araffin commented 3 years ago

Hello,

Having this be an assert statement causes code to be unable to run, even though it likely could run mostly fine. This means it should probably be a warning. The error messages also seems to imply this. I can do a little PR if the maintainers agree with this.

if it runs, yes this should be converted to a warning, as we did for SB3: https://github.com/DLR-RM/stable-baselines3/pull/270

However, we are currently having trouble with Travis CI (no more open source credits, see https://github.com/hill-a/stable-baselines/pull/1081 and https://github.com/hill-a/stable-baselines/pull/1067) and we are currently having trouble contacting @hill-a (the owner of the repo which is the only one that can deactivate that requirement).

Anyway, as SB2 is in maintenance mode, I would recommend you to switch to SB3 if possible.

araffin commented 3 years ago

Fixed in SB3