Open muk465 opened 2 years ago
Hello,
I have the same problem, but I found these links, which might help you: https://github.com/hill-a/stable-baselines/issues/352 and https://github.com/hill-a/stable-baselines/issues/776
I have not tried the proposed solution yet, because I still need to figure out how to combine it with seeding my env. The idea is that every time a new episode "i" starts, the algorithm passes "i" as the seed of my env (seed(i)), so that I can sample new but reproducible values.
Anyway, good luck.
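For what it's worth, one way the per-episode seeding could be wired in is a wrapper whose reset ignores the seed it is given and uses its own episode counter instead. This is only a dependency-free sketch, not something tested against stable-baselines: `ToyEnv` and `EpisodeSeedWrapper` are made-up names, and a real version would presumably subclass `gymnasium.Env` and `gymnasium.Wrapper`.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for the custom env: samples a value on reset."""

    def reset(self, seed=None):
        # A private RNG makes the sampled value depend only on the seed.
        self.rng = random.Random(seed)
        self.start = self.rng.random()
        return self.start, {}


class EpisodeSeedWrapper:
    """Hypothetical wrapper: passes the episode index i as the seed."""

    def __init__(self, env):
        self.env = env
        self.episode = 0

    def reset(self, seed=None):
        # Ignore the caller's seed and use the episode counter instead,
        # so episode i is always seeded with i.
        obs, info = self.env.reset(seed=self.episode)
        info["episode_seed"] = self.episode
        self.episode += 1
        return obs, info


# Two independent runs see the same per-episode values.
run_a = EpisodeSeedWrapper(ToyEnv())
run_b = EpisodeSeedWrapper(ToyEnv())
values_a = [run_a.reset()[0] for _ in range(3)]
values_b = [run_b.reset()[0] for _ in range(3)]
```

Since the training loop is the one calling reset, keeping the counter inside the wrapper means the algorithm itself does not need to know about "i".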
Question: I am using a custom environment to do path planning with the DDPG algorithm (I am using Stable Baselines3).
model = DDPG("MlpPolicy", env, action_noise=action_noise, verbose=1)
model.learn(total_timesteps=10000, log_interval=1)
model.save("sb3_ddpg_model")
Here model.learn trains for a fixed number of timesteps, but I want to train in episodes instead, for example 3000 steps per episode over multiple episodes. How can I achieve that?
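A possible approach, sketched under some assumptions: `learn` counts environment steps, and an episode ends whenever the env reports `terminated` or `truncated`. So the episode length can be capped in the env, and `total_timesteps` set to episodes × steps per episode. Gymnasium ships `gymnasium.wrappers.TimeLimit(env, max_episode_steps=3000)` for the capping part. The snippet below re-implements that idea without dependencies just to show the arithmetic; `ToyPathEnv`, `StepLimit` and `run` are hypothetical names, and the loop stands in for what the training loop does internally.

```python
class ToyPathEnv:
    """Hypothetical stand-in for the custom env; never ends on its own."""

    def reset(self, seed=None):
        self.pos = 0.0
        return self.pos, {}

    def step(self, action):
        self.pos += action
        # obs, reward, terminated, truncated, info (Gymnasium-style 5-tuple)
        return self.pos, -abs(self.pos), False, False, {}


class StepLimit:
    """Hypothetical wrapper: truncates an episode after a fixed step count."""

    def __init__(self, env, max_episode_steps):
        self.env = env
        self.max_episode_steps = max_episode_steps
        self.elapsed = 0

    def reset(self, seed=None):
        self.elapsed = 0
        return self.env.reset(seed=seed)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.elapsed += 1
        if self.elapsed >= self.max_episode_steps:
            truncated = True  # step budget for this episode is used up
        return obs, reward, terminated, truncated, info


MAX_STEPS = 3000      # steps per episode
N_EPISODES = 5        # number of episodes wanted
TOTAL_TIMESTEPS = MAX_STEPS * N_EPISODES  # value to pass as total_timesteps


def run(env, total_timesteps):
    """Step the env for total_timesteps and record each episode's length."""
    lengths, length = [], 0
    env.reset()
    for _ in range(total_timesteps):
        _, _, terminated, truncated, _ = env.step(0.1)
        length += 1
        if terminated or truncated:
            lengths.append(length)
            length = 0
            env.reset()
    return lengths


episode_lengths = run(StepLimit(ToyPathEnv(), MAX_STEPS), TOTAL_TIMESTEPS)
```

With the real library this would roughly correspond to wrapping the custom env in the time-limit wrapper and calling `model.learn(total_timesteps=TOTAL_TIMESTEPS)`. If the env can also terminate early (for example when the goal is reached), episodes will be shorter than 3000 steps and more than `N_EPISODES` of them will fit in the same budget.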