Open tldoan opened 6 years ago
Hi tlss94,
You are right, the function "update_init_selector" shouldn't be used there. This bug does not affect the default execution of the evaluation: calling `test_and_plot_policy` calls `test_policy`, whose default argument is `parallel=True`, so it returns through `test_policy_parallel` instead of executing any of the code you were pointing at. You can see that the parallel evaluation uses `evaluate_states`: https://github.com/florensacc/rllab-curriculum/blob/master/curriculum/envs/maze/maze_evaluate.py#L312
and this in turn ends up using `env.update_start_generator(FixedStateGenerator(state))`, which is valid: https://github.com/florensacc/rllab-curriculum/blob/master/curriculum/state/evaluator.py#L274
To answer your question: the first link above shows that the evaluated states come from the functions `tile_space` or `find_empty_spaces`. Is this what you were looking for? If you want some other custom set of states tested, you can modify that part of the code.
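As a rough illustration of testing a custom set of states, the sketch below loops over a hand-written list and fixes the start state before each rollout. `ToyEnv`, `evaluate_custom_states`, and `rollout_fn` are hypothetical names, and `FixedStateGenerator` here is a simplified stand-in for the class of the same name in the repository:

```python
class FixedStateGenerator:
    """Simplified stand-in: always yields the same start state."""

    def __init__(self, state):
        self.state = tuple(state)

    def update(self):
        return self.state


class ToyEnv:
    """Hypothetical environment exposing an update_start_generator hook."""

    def __init__(self):
        self.start_generator = None

    def update_start_generator(self, generator):
        self.start_generator = generator

    def reset(self):
        # Each reset starts from whatever the current generator yields.
        return self.start_generator.update()


def evaluate_custom_states(env, states, rollout_fn):
    """Run rollout_fn once per custom start state and collect the results."""
    results = []
    for state in states:
        env.update_start_generator(FixedStateGenerator(state))
        results.append(rollout_fn(env))
    return results


# Assumed custom start positions, replacing the tiled or empty-space states.
custom_states = [(0.0, 0.0), (1.0, 2.0), (3.0, -1.0)]
starts = evaluate_custom_states(ToyEnv(), custom_states, lambda env: env.reset())
```

Here the rollout function only returns the reset state, which is enough to check that each evaluation starts where intended; a real rollout would run the policy from that state.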
Hi,
I understand that for the evaluation phase of AntMaze you just let the Ant start from init_pos: https://github.com/florensacc/rllab-curriculum/blob/master/curriculum/experiments/starts/maze/maze_ant/maze_ant_brownian_algo.py#L140-L155
But for PointMass, is it possible to have a list of init_pos too? I tried to understand your code:
apparently you are using the test_policy methods: https://github.com/florensacc/rllab-curriculum/blob/a1eb78b31b992f753eb66316246d24b78bc36962/curriculum/experiments/starts/maze/maze_brownian_algo.py#L157
But I couldn't find the method update_init_selector, which apparently modifies the initial states when resetting the env:
https://github.com/florensacc/rllab-curriculum/blob/81a3714eabfb93a6aa96c885d63b4a3602c0f72c/curriculum/envs/maze/maze_evaluate.py#L219 https://github.com/florensacc/rllab-curriculum/search?q=update_init_selector&unscoped_q=update_init_selector
Thank you very much.