Open cathywu opened 7 years ago
It seems that MultiagentSharedEnv is quite a difficult environment, even without collisions.
Test-time rollout for k=6: video (Summary: in 1000 steps, 5 of 6 agents reach the target, although they get close within a few steps.)
MultiagentSharedEnv, done_epsilon=0.01, batch_size=10000, k=50 (WHY does an (exponential) discount factor of 0.5 cause such a regression?):
Next I will try binary and linear discounting instead of exponential.
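For reference, a minimal sketch of how the three spatial discounting schemes could differ, assuming the discount is a weight on another agent's reward as a function of its distance. The function name `spatial_discount` and the `radius` cutoff are hypothetical and not taken from the repo; only the exponential factor of 0.5 comes from the run above.

```python
import numpy as np

def spatial_discount(dist, kind="exponential", gamma=0.5, radius=1.0):
    """Weight in [0, 1] for another agent's reward, given its distance.

    Hypothetical helper: `radius` is an assumed cutoff, not a repo parameter.
    """
    dist = np.asarray(dist, dtype=float)
    if kind == "exponential":
        # Weight decays geometrically with distance: gamma ** dist.
        return gamma ** dist
    if kind == "linear":
        # Weight falls linearly from 1 at dist=0 to 0 at dist=radius.
        return np.clip(1.0 - dist / radius, 0.0, 1.0)
    if kind == "binary":
        # Full weight inside the radius, zero weight outside.
        return (dist <= radius).astype(float)
    raise ValueError(f"unknown discount kind: {kind!r}")

dists = np.array([0.0, 0.5, 1.0, 2.0])
for kind in ("exponential", "linear", "binary"):
    print(kind, spatial_discount(dists, kind))
```

Under these assumptions, the exponential scheme never fully drops distant agents, while the linear and binary schemes cut them off entirely beyond the radius.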
Results:
Pending:
MultiagentSharedEnv, done_epsilon=0.01, batch_size=10000, collision_penalty=50, k=25 (first positive result on spatial discounting!):
MultiagentSharedEnv, done_epsilon=0.01, batch_size=10000, collision_penalty=50, k=50:
Note: local run, 1 seed
Results:
Tracking the status.