sph001 closed 3 years ago
Probably a duplicate of https://github.com/hill-a/stable-baselines/issues/30#issuecomment-423694592
Anyway, I would highly recommend using the RL Zoo (cf. doc) to avoid such issues, and using TD3, SAC or TQC, which usually perform better than DDPG.
🐛 Bug
If I train a model with DDPG + HER (200,000 steps) and evaluate it over 1000 iterations, I get a success rate of ~95%. If I then save that model, load it into a fresh instance of the same model class, and run the exact same evaluation, it has a 0% success rate.
I can't find anything in my environment that might interfere with the model, but the loaded model's replay buffer is empty, which suggests that the replay buffer is not being saved with the model.
To Reproduce
Expected behavior
I expect that if I save a model, I can call load and restore the model in the exact state it was in when I saved it.
System Info
Python 3.7, SB3 1.1.0, torch 1.8.1+cu111, gym 0.18.3