Open huangjiancong1 opened 6 years ago
I am training with the DDPG+HER algorithm using the following config settings:
Logging to /tmp/openai-2018-08-06-21-12-50-218152
T: 50
_Q_lr: 0.001
_action_l2: 1.0
_batch_size: 256
_buffer_size: 1000000
_clip_obs: 200.0
_hidden: 256
_layers: 3
_max_u: 1.0
_network_class: baselines.her.actor_critic:ActorCritic
_norm_clip: 5
_norm_eps: 0.01
_pi_lr: 0.001
_polyak: 0.95
_relative_goals: False
_scope: ddpg
ddpg_params: {'buffer_size': 1000000, 'hidden': 256, 'layers': 3, 'network_class': 'baselines.her.actor_critic:ActorCritic', 'polyak': 0.95, 'batch_size': 256, 'Q_lr': 0.001, 'pi_lr': 0.001, 'norm_eps': 0.01, 'norm_clip': 5, 'max_u': 1.0, 'action_l2': 1.0, 'clip_obs': 200.0, 'scope': 'ddpg', 'relative_goals': False}
env_name: FetchSlide-v1
gamma: 0.98
make_env: <function prepare_params.<locals>.make_env at 0x7fd9807e9bf8>
n_batches: 40
n_cycles: 50
n_test_rollouts: 10
noise_eps: 0.2
random_eps: 0.3
replay_k: 4
replay_strategy: future
rollout_batch_size: 2
test_with_polyak: False
Creating a DDPG agent with action space 4 x 1.0...
Training...
-----------------------------------
| epoch | 0 |
| stats_g/mean | 0.5070929 |
| stats_g/std | 0.46622786 |
| stats_o/mean | 0.11331562 |
| stats_o/std | 0.22906026 |
| test/episode | 20.0 |
| test/mean_Q | -2.963306 |
| test/success_rate | 0.0 |
| train/episode | 100.0 |
| train/success_rate | 0.0 |
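For anyone reading the log: the per-epoch counts follow directly from the config values. Here is a rough sketch of the arithmetic, assuming a single worker (no MPI parallelism) and assuming that each cycle runs `rollout_batch_size` episodes followed by `n_batches` gradient updates. The variable names on the left of each result are my own, not names from baselines.

```python
# Values copied from the config dump above.
T = 50                  # timesteps per episode
n_cycles = 50           # cycles per epoch
rollout_batch_size = 2  # episodes collected per rollout
n_batches = 40          # gradient updates per cycle (assumed meaning)
n_test_rollouts = 10    # test rollouts per epoch

# Derived per-epoch quantities (hypothetical names, single worker assumed).
train_episodes = n_cycles * rollout_batch_size         # matches train/episode
test_episodes = n_test_rollouts * rollout_batch_size   # matches test/episode
train_timesteps = train_episodes * T
gradient_updates = n_cycles * n_batches

print(train_episodes, test_episodes, train_timesteps, gradient_updates)
```

The first two numbers agree with `train/episode` (100.0) and `test/episode` (20.0) in the epoch 0 output, so one epoch here is only about 5000 environment steps.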
Unrelated question, but which mujoco and gym versions are you using?
How can I improve the success rate? My goal is to use a Baxter robot to push an object to a target point in MuJoCo. My Gym environment is complete, but its training success rate stays very low (0.0 to 0.1). The success rate after training 200 times is not far from the success rate after training 50 times.
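To make the `replay_strategy: future` and `replay_k: 4` settings from the config concrete, here is a minimal NumPy sketch of "future" goal relabeling as used in HER. It is not the baselines code. The function name `sample_her_future` and the array layout are assumptions for illustration: `achieved_goals` has shape `(n_episodes, T + 1, goal_dim)` and `desired_goals` has shape `(n_episodes, T, goal_dim)`.

```python
import numpy as np

def sample_her_future(achieved_goals, desired_goals, batch_size, replay_k, rng):
    """Sample (episode, timestep) pairs and relabel a fraction of their goals
    with a goal achieved later in the same episode (hypothetical helper)."""
    n_episodes, T, _ = desired_goals.shape
    # With replay_k relabeled samples per original one, the relabeled
    # fraction is k / (1 + k), e.g. 0.8 for replay_k = 4.
    future_p = 1.0 - 1.0 / (1.0 + replay_k)

    ep_idx = rng.integers(0, n_episodes, size=batch_size)
    t_idx = rng.integers(0, T, size=batch_size)
    goals = desired_goals[ep_idx, t_idx].copy()

    # Choose which samples get a substituted goal.
    relabel = rng.random(batch_size) < future_p
    # Pick a future timestep uniformly from t + 1 .. T (inclusive).
    offset = (rng.random(batch_size) * (T - t_idx)).astype(int)
    future_t = t_idx + 1 + offset
    goals[relabel] = achieved_goals[ep_idx[relabel], future_t[relabel]]
    return ep_idx, t_idx, goals, relabel

# Example with random data shaped like T = 50 episodes and 3-D goals.
rng = np.random.default_rng(0)
achieved = rng.normal(size=(5, 51, 3))
desired = rng.normal(size=(5, 50, 3))
ep_idx, t_idx, goals, relabel = sample_her_future(achieved, desired, 4096, 4, rng)
print(goals.shape, round(float(relabel.mean()), 2))
```

After relabeling, the reward for those transitions has to be recomputed against the substituted goal, so a custom environment needs a reward that depends only on the achieved goal and the goal being compared against.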