This issue tracks the comparison between the baselines for MultiagentPointEnv.
For MultiagentPointEnv(d=1, k=6), ActionDependentGaussianMLPBaseline appears to converge about 2x faster than GaussianMLPBaseline. Note: no parameter tuning has been done at this point. The results so far cover 26 runs across 3 step sizes, using only random seed 1.
Running with commit 4e11e7ff29e.
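One way to make "converges about 2x faster" concrete is to compare how many iterations each baseline needs before the average return first reaches a chosen threshold. The sketch below is only an illustration under that assumption: the helper names `iterations_to_threshold` and `speedup`, the threshold, and the two return curves are hypothetical placeholders, not code or data from these runs.

```python
def iterations_to_threshold(returns, threshold):
    """Return the first iteration index whose return is >= threshold, or None."""
    for i, r in enumerate(returns):
        if r >= threshold:
            return i
    return None


def speedup(baseline_returns, candidate_returns, threshold):
    """Ratio of iterations needed by the baseline vs. the candidate.

    Returns None if either curve never reaches the threshold, or if the
    candidate reaches it at iteration 0 (the ratio would be undefined).
    """
    base = iterations_to_threshold(baseline_returns, threshold)
    cand = iterations_to_threshold(candidate_returns, threshold)
    if base is None or cand is None or cand == 0:
        return None
    return base / cand


# Placeholder curves for illustration only; NOT results from these experiments.
slow = [-100, -80, -60, -40, -20, -10]
fast = [-100, -60, -20, -10, -10, -10]

print(speedup(slow, fast, threshold=-20))
```

With several seeds and step sizes, the same ratio could be computed per run and then averaged, which would show whether the gap holds beyond a single seed.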