Stable-Baselines-Team / stable-baselines3-contrib

Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code
https://sb3-contrib.readthedocs.io
MIT License

Worse training with Vectorized Environment #210

Closed pklochowicz closed 8 months ago

pklochowicz commented 12 months ago

🐛 Bug

I'm training a RecurrentPPO model on a custom environment. To speed up training, I switched from DummyVecEnv with num_envs = 1 to SubprocVecEnv with num_envs = 32, and noticed that this completely changed the training behavior (for the 32-env run I decreased n_steps by a factor of 32, so the rollout buffer size stayed the same). Below is a plot of the two training runs: pink is DummyVecEnv (1 env) and green is SubprocVecEnv (32 envs). How can this large difference be explained?

Thank you in advance,

[Image: plot comparing the two training runs]
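The n_steps rescaling described above can be sketched as follows. In Stable-Baselines3, one rollout collects `n_envs * n_steps` transitions, so dividing `n_steps` by the number of environments keeps the buffer size fixed (the concrete numbers here are hypothetical; the issue does not state the actual `n_steps`):

```python
# In SB3, each rollout collects n_envs * n_steps transitions before an update.
def rollout_size(n_envs: int, n_steps: int) -> int:
    return n_envs * n_steps

# Hypothetical base value for illustration.
base_n_steps = 2048

single = rollout_size(1, base_n_steps)            # DummyVecEnv, 1 env
vectorized = rollout_size(32, base_n_steps // 32)  # SubprocVecEnv, n_steps / 32

print(single, vectorized)  # both 2048: the rollout buffer size is unchanged
```

Note that even with an equal buffer size, the two setups are not equivalent: with 32 envs each rollout segment is 32x shorter, which in particular shortens the sequences available to RecurrentPPO's LSTM during training.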

Code example

No response

Relevant log output / Error message

No response

System Info

No response

Checklist