I tried changing the line
config.sim_summaries[name] = _define_simulation(task, config, params, horizon, 1)
to
config.sim_summaries[name] = _define_simulation(task, config, params, horizon, 10)
in scripts/configs.py
and the test scores drop significantly and remain low throughout training. I would guess this is because I'm averaging over multiple episodes, but looking at the TensorBoard simulation GIF, the agents are actually not performing correctly in this case.
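For context, a minimal sketch of what averaging test returns over several episodes would look like, assuming a gym-style environment and a policy callable (both hypothetical placeholders, not the planet API):

```python
import numpy as np

def evaluate(env, policy, num_episodes=10):
    """Average the return over several rollouts to form one test score."""
    returns = []
    for _ in range(num_episodes):
        obs, done, episode_return = env.reset(), False, 0.0
        while not done:
            action = policy(obs)                      # hypothetical policy interface
            obs, reward, done, _ = env.step(action)   # gym-style step
            episode_return += reward
        returns.append(episode_return)
    # Averaging lowers the variance of the score but should not change its
    # expected value relative to a single-episode evaluation.
    return float(np.mean(returns))
```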
I tried setting log_every=1
instead of log_every=config.test_steps
in line 175 of training/utility.py, but that didn't result in lower scores. However, in this case the test results are higher than the training results (almost max performance at the first test evaluation, whereas training performance is still quite low).
I'm struggling to make sense of this.
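For what it's worth, here is how I read the log_every change (my own illustration of the gating pattern, not the actual code in training/utility.py): with log_every=1 a test summary would be written at every step, while log_every=config.test_steps writes it once per test phase.

```python
# Hypothetical sketch of a log_every gate, not the actual code in training/utility.py.
def logging_steps(total_steps, log_every):
    """Return the steps at which a test summary would be written."""
    return [step for step in range(total_steps) if step % log_every == 0]

print(logging_steps(total_steps=20, log_every=1))  # every step
print(logging_steps(total_steps=20, log_every=5))  # once per "test phase" of 5 steps
```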
Those are good questions I'd like to know the answers to as well. Maybe look at this issue: https://github.com/google-research/planet/issues/10#issuecomment-493103249. I've discussed the topic a bit with @danijar.
Thanks for reaching out. The figures use the data shown on TensorBoard. They show the median and percentiles 5 to 95 over 5 random seeds and a window of 10 episodes. It should be easy to replicate this from the CSV you can download from TensorBoard. There is also a script called fetch_events that can extract CSV files directly from TensorFlow summary files.
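For anyone trying to reproduce those curves, here is a rough sketch of computing the statistics from exported CSVs with pandas/numpy. The filenames, the step/value column names, aligned steps across seeds, and smoothing each seed over the 10-episode window before aggregating over seeds are all assumptions on my part, not something taken from the repo.

```python
import numpy as np
import pandas as pd

WINDOW = 10     # window of 10 episodes
NUM_SEEDS = 5

# Hypothetical filenames/columns: one CSV per seed with "step" and "value" rows.
runs = [pd.read_csv(f"scores_seed_{i}.csv") for i in range(NUM_SEEDS)]

# Smooth each seed over 10 consecutive episodes, then aggregate across seeds.
smoothed = np.stack([run["value"].rolling(WINDOW, min_periods=1).mean().to_numpy()
                     for run in runs])                 # shape: (seeds, episodes)

steps = runs[0]["step"].to_numpy()                     # assumes aligned steps across seeds
median = np.median(smoothed, axis=0)                   # center line of the figure
low, high = np.percentile(smoothed, [5, 95], axis=0)   # shaded band
```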
Hi Danijar,
I have one more question about the "window of 10 episodes". What does that mean? My guess is that you use 5 different random seeds, run each seed once, and take the mean over those 5 experiments? Thanks a lot!
It's the mean and variance aggregated over both 5 seeds and 10 consecutive episodes for each of the seeds.
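In other words (my reading of the above, a sketch assuming the seeds-by-window block is pooled before computing the statistics):

```python
import numpy as np

def windowed_stats(scores, t, window=10):
    """Mean and variance over all seeds and the last `window` episodes up to index t."""
    pooled = scores[:, max(0, t - window + 1):t + 1].ravel()  # seeds x up-to-10 episodes
    return pooled.mean(), pooled.var()

scores = np.random.rand(5, 100)   # placeholder data: 5 seeds, 100 test episodes each
mean_t, var_t = windowed_stats(scores, t=50)
```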
Hey, the models train very well out of the box with this code, but the TensorBoard plots do not correspond to Figure 4 (as far as I can tell, they correspond to the evaluation of a single episode every 5000 steps).
I have a few specific questions:
Thanks a lot (I'm trying to use this model as a comparative experiment in time for the NeurIPS deadline, so a reply within the next couple of days would be greatly appreciated :) )