I am trying to recreate Figure 6 in the paper, which shows the effect of all the known techniques on CoinRun performance. I'd appreciate help with the following:
1. Starting a new training run simply clears the directory and deletes all previous logs. Is that intended? For now I archive the old run before launching a new one; a sketch of my workaround is below.
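This is just a stdlib-only backup step I run before each launch (a minimal sketch; `results/coinrun` is a placeholder for wherever your run actually writes its logs):

```python
import os
import shutil
import time

def archive_previous_run(save_dir: str) -> None:
    """Copy an existing log directory to a timestamped backup before it is wiped."""
    if os.path.isdir(save_dir):
        backup = f"{save_dir}-backup-{time.strftime('%Y%m%d-%H%M%S')}"
        shutil.copytree(save_dir, backup)
        print(f"archived old logs to {backup}")

archive_previous_run("results/coinrun")  # hypothetical path, adjust to your setup
```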
2. Training with --test doesn't seem to produce enough information in TensorBoard. I assume rewmean is the mean reward when testing on random levels; are the training mean-reward results missing, or logged somewhere else?
3. Resuming training restarts the episode and total-timestep counters from 0. Shouldn't they continue from the previous values? (I can see the model is loaded and keeps improving.) This ends up producing misleading graphs in TensorBoard. The sketch after this item shows the kind of fix I have in mind.
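A minimal sketch of the workaround I'm considering: persist the total-timestep counter next to the checkpoint and add it as an offset to every step logged to TensorBoard. The file path and helper names here are mine, not from the coinrun code:

```python
import json
import os

STEP_FILE = "results/coinrun/global_step.json"  # hypothetical path, next to the checkpoint

def load_step_offset() -> int:
    """Total timesteps recorded by the previous run, or 0 on a fresh start."""
    if os.path.exists(STEP_FILE):
        with open(STEP_FILE) as f:
            return json.load(f)["total_timesteps"]
    return 0

def save_step_offset(total_timesteps: int) -> None:
    """Persist the counter whenever the model checkpoint is saved."""
    os.makedirs(os.path.dirname(STEP_FILE), exist_ok=True)
    with open(STEP_FILE, "w") as f:
        json.dump({"total_timesteps": total_timesteps}, f)

# When resuming, shift every step passed to the TensorBoard writer so the
# curves continue where the last run ended instead of restarting at 0, e.g.:
#   writer.add_scalar("rew_mean", rew_mean, load_step_offset() + steps_this_run)
```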
4. What changes come to mind if I try to apply DQN here? I've sketched the pieces I expect to matter below.
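For concreteness, this is the kind of DQN machinery I expect to need on top of the existing setup: a replay buffer, a frozen target network, and Q-values over the discrete action set. It's a minimal PyTorch illustration under assumed shapes (64x64 RGB observations, 7 actions), not code adapted from this repository:

```python
import random
from collections import deque

import torch
import torch.nn as nn

class QNet(nn.Module):
    """Small CNN mapping a 64x64 RGB observation to one Q-value per action."""
    def __init__(self, num_actions: int = 7):  # 7 is a stand-in for the real action count
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * 6 * 6, 256), nn.ReLU(),  # 6x6 is the feature map for 64x64 input
            nn.Linear(256, num_actions),
        )

    def forward(self, x):
        return self.net(x)

q_net, target_net = QNet(), QNet()
target_net.load_state_dict(q_net.state_dict())  # re-sync the target net periodically
optimizer = torch.optim.Adam(q_net.parameters(), lr=2.5e-4)
replay = deque(maxlen=100_000)  # holds (s, a, r, s_next, done) tuples of tensors
gamma = 0.99

def dqn_update(batch_size: int = 32) -> None:
    """One gradient step on a sampled minibatch using the standard TD target."""
    s, a, r, s2, done = map(torch.stack, zip(*random.sample(replay, batch_size)))
    with torch.no_grad():
        # Bellman target: r + gamma * max_a' Q_target(s', a'), cut off at episode end
        target = r + gamma * (1.0 - done) * target_net(s2).max(dim=1).values
    q = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The main conceptual shift I see is from on-policy rollouts to off-policy replay, plus epsilon-greedy exploration over the discrete actions.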
Thanks in advance