Worse training with Vectorized Environment

🐛 Bug

I'm training a RecurrentPPO model on a custom environment. To speed up training, I switched to SubprocVecEnv with num_envs = 32. This completely changed training performance compared to DummyVecEnv with num_envs = 1. For the 32-env run I reduced n_steps by a factor of 32, so the total batch collected per update (n_steps × num_envs) stayed the same. The plot below shows the two training runs: pink is DummyVecEnv (1) and green is SubprocVecEnv (32). Do you know what could explain such a large difference?
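For illustration, the bookkeeping behind this comparison can be sketched as below. This is only a sketch under stated assumptions: the single-env n_steps value of 4096 is hypothetical (the actual value is not given in this report), and RolloutConfig is a made-up helper, not part of stable-baselines3 or sb3-contrib.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutConfig:
    """Hypothetical helper describing one rollout setup (not a library class)."""

    n_steps: int  # steps collected from each environment before an update
    n_envs: int   # number of parallel environments

    @property
    def buffer_size(self) -> int:
        # Total transitions collected per update across all environments.
        return self.n_steps * self.n_envs

    @property
    def longest_segment(self) -> int:
        # Longest run of consecutive steps any single environment
        # contributes to one rollout.
        return self.n_steps


# Assumed value for illustration only; the report does not state n_steps.
single = RolloutConfig(n_steps=4096, n_envs=1)
# n_steps reduced by a factor of 32, as described in the report.
parallel = RolloutConfig(n_steps=4096 // 32, n_envs=32)

for name, cfg in (("DummyVecEnv (1)", single), ("SubprocVecEnv (32)", parallel)):
    print(f"{name}: buffer_size={cfg.buffer_size}, "
          f"longest_segment={cfg.longest_segment}")
```

Under these assumed numbers the buffer size is identical in both runs, but each environment contributes far fewer consecutive steps per rollout in the 32-env run. That could matter for a recurrent policy and for advantage estimation, though it is only one possible factor and not a confirmed cause.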

Thank you in advance,

Code example

No response

Relevant log output / Error message

No response

System Info

No response

Checklist

[X] I have checked that there is no similar issue in the repo
[X] I have read the documentation
[X] I have checked my env using the env checker

Stable-Baselines-Team / stable-baselines3-contrib