Closed zinan93 closed 6 years ago
Hello,
It seems that we improved "discretized_progress" through this commit but did not update the notebook accordingly. I have just updated the notebook, so you should now get the same result as we do: "discretized_progress" seems to do better than "InterestTree" on this task with the default parameters tested.
Ah, I see. Could you briefly sketch the intuition as to why the updated discretized_progress is better in this particular experiment? The major difference I see is that, rather than using the covariance of time and learning error, it takes the difference between the means of two windows. Why is this better than splitting in a tree structure?
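To make the two estimators named in the question concrete, here is a minimal sketch, assuming each cell keeps a list of recent learning errors in time order. The function names `progress_by_covariance` and `progress_by_window_means` are hypothetical, and this is not the library's actual code.

```python
import numpy as np

def progress_by_covariance(errors):
    """Absolute covariance between time index and error (illustrative only)."""
    errors = np.asarray(errors, dtype=float)
    if len(errors) < 2:
        return 0.0
    t = np.arange(len(errors))
    # np.cov on two 1-D arrays returns a 2x2 matrix; [0, 1] is the cross term.
    return abs(np.cov(t, errors)[0, 1])

def progress_by_window_means(errors):
    """Absolute difference between the means of the older and newer half."""
    errors = np.asarray(errors, dtype=float)
    half = len(errors) // 2
    if half == 0:
        return 0.0
    return abs(errors[half:].mean() - errors[:half].mean())

flat = [0.3, 0.3, 0.3, 0.3]        # no learning: error stays constant
improving = [4.0, 3.0, 2.0, 1.0]   # error decreases steadily

print(progress_by_covariance(flat), progress_by_window_means(flat))
print(progress_by_covariance(improving), progress_by_window_means(improving))
```

Both measures are zero when the error does not change and positive when it trends down, so on clean data they agree in sign; they differ in scale and in how they weight individual samples.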
One difference between the implementations of discretized_progress and InterestTree is how the distance error, and thus the progress in a cell, is computed. In discretized_progress, the distance error is capped by the cell size. In a far, unreachable cell, the distance from a goal in that cell to a reached point outside the cell is therefore always equal to the cap, so the error never changes and the progress is strictly zero. As a result, unreachable cells are sampled less often when choosing a new goal, and cells with positive progress are chosen more often. This capping is not implemented in InterestTree.
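The effect of the cap can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: `capped_error`, `window_progress`, the cap value, and the sample goals and reached points are all hypothetical.

```python
import numpy as np

def capped_error(goal, reached, cap):
    """Euclidean distance from goal to reached point, never larger than cap."""
    diff = np.asarray(goal, dtype=float) - np.asarray(reached, dtype=float)
    return min(float(np.linalg.norm(diff)), cap)

def window_progress(errors):
    """Absolute difference between the mean error of the two halves."""
    errors = np.asarray(errors, dtype=float)
    half = len(errors) // 2
    if half == 0:
        return 0.0
    return abs(errors[half:].mean() - errors[:half].mean())

cap = 0.1  # assumed cell size, used as the maximum error

# Goals in a far cell that is never reached: the reached points stay elsewhere.
far_goals = [(5.0, 5.0), (5.05, 5.0), (5.0, 5.05), (5.05, 5.05)]
far_reached = [(0.9, 0.1), (0.8, 0.3), (0.7, 0.5), (0.6, 0.6)]

capped = [capped_error(g, r, cap) for g, r in zip(far_goals, far_reached)]
uncapped = [capped_error(g, r, np.inf) for g, r in zip(far_goals, far_reached)]

# Goals in a reachable cell: the reached points get closer over time.
near_goals = [(0.5, 0.5)] * 4
near_reached = [(0.58, 0.5), (0.56, 0.5), (0.52, 0.5), (0.51, 0.5)]
near = [capped_error(g, r, cap) for g, r in zip(near_goals, near_reached)]

unreachable_progress = window_progress(capped)    # every error equals the cap
spurious_progress = window_progress(uncapped)     # noise in large distances
reachable_progress = window_progress(near)

print(unreachable_progress, spurious_progress, reachable_progress)
```

With the cap, every error in the unreachable cell equals the cap, so its progress is exactly zero. Without it, small fluctuations in large distances show up as nonzero progress even though nothing is being learned there.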
Another thing to take into account is that, in theory, InterestTree does better than discretized_progress when there are many unreachable cells. That is not quite the case in this example, but it would be in a higher-dimensional sensory space (e.g. >= 3D) with small cells. The InterestTree model is supposed to handle high-dimensional spaces by growing a tree that is consistent with the task, with more cells at the boundaries of the task, and thus to give better estimates of the progress in each cell and of the variation of progress across the sensory space.
I followed the tutorial here. However, when comparing the three goal sampling methods, tree seems to be worse than discretized_progress in the evaluation plots:
Looking at the sampled goals and reached points:
it seems that tree_goal and discretized_progress_goal are reversed, and so is the heat plot:
I ran the experiments multiple times using the exact same code from the tutorial, and most of the time (8 runs out of 10) the results look like the ones above. Any idea what went wrong here? Thanks!