Closed AvalonGuo closed 8 months ago
model = DDPG(
    "MultiInputPolicy",
    env,
    learning_rate=0.001,
    buffer_size=100000,
    replay_buffer_class=HerReplayBuffer,
    tensorboard_log=log_dir,
    tau=0.05,
    gamma=0.95,
    verbose=1,
)
model.learn(total_timesteps=1.6e6, progress_bar=True)
model.save("ddpg_franka")
I found that training did not stop with the default settings
What do you mean exactly by that?
Sorry, I didn't express myself clearly enough. The situation is shown in the figure above, when training with DDPG. The code is the same except for total_timesteps.
Using the following code and the latest versions of Gymnasium, SB3, and panda-gym, I cannot reproduce the issue:
import panda_gym # noqa: F401
from stable_baselines3 import DDPG, HerReplayBuffer
model = DDPG(
    "MultiInputPolicy",
    "PandaPickAndPlace-v3",
    learning_rate=0.001,
    buffer_size=1000,
    replay_buffer_class=HerReplayBuffer,
    tau=0.05,
    gamma=0.95,
    verbose=1,
    learning_starts=100,
    policy_kwargs=dict(net_arch=[64]),
)
model.learn(total_timesteps=1000, progress_bar=True)
model.env.close()
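One visible difference between the two snippets is that the original run passes the step budget as a float (1.6e6), while this script passes an int (1000). Whether that has anything to do with the reported behaviour is not established in this thread. As a minimal sketch for ruling it out, the budget could be converted to a plain int before it is handed to model.learn. The helper name as_timestep_budget is hypothetical and is not part of SB3:

```python
# Hypothetical helper (not an SB3 function): turn a step budget written in
# scientific notation into a plain positive int, rejecting fractional values.
def as_timestep_budget(value: float) -> int:
    budget = int(value)
    if budget != value:
        raise ValueError(f"step budget must be a whole number, got {value!r}")
    if budget <= 0:
        raise ValueError(f"step budget must be positive, got {value!r}")
    return budget


# 1.6e6 is the value used in the original report.
total_timesteps = as_timestep_budget(1.6e6)
print(type(total_timesteps).__name__, total_timesteps)
```

The resulting value could then be passed as total_timesteps=total_timesteps in the original call, which keeps everything else in the run unchanged.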
🐛 Bug
When training panda_gym's PickAndPlace env using DDPG, I found that training did not stop with the default settings.
To Reproduce
System Info
Checklist