florensacc / rllab-curriculum

Evaluation point for PointMass environment #12

Open tldoan opened 6 years ago

tldoan commented 6 years ago

Hi,

I understand that in the evaluation phase for AntMaze you just let the Ant start from init_pos: https://github.com/florensacc/rllab-curriculum/blob/master/curriculum/experiments/starts/maze/maze_ant/maze_ant_brownian_algo.py#L140-L155

But for PointMass, is it possible to have a list of init_pos too? I tried to understand your code:
apparently you are using the test_policy method: https://github.com/florensacc/rllab-curriculum/blob/a1eb78b31b992f753eb66316246d24b78bc36962/curriculum/experiments/starts/maze/maze_brownian_algo.py#L157

But I couldn't find the method update_init_selector, which apparently modifies the initial states when resetting the env.

https://github.com/florensacc/rllab-curriculum/blob/81a3714eabfb93a6aa96c885d63b4a3602c0f72c/curriculum/envs/maze/maze_evaluate.py#L219
https://github.com/florensacc/rllab-curriculum/search?q=update_init_selector&unscoped_q=update_init_selector

Thank you very much.

florensacc commented 6 years ago

Hi tlss94,

You are right, the function "update_init_selector" shouldn't be used there. This bug does not affect the default execution of the evaluation: test_and_plot_policy calls test_policy, whose default argument is parallel=True, so it returns the result of test_policy_parallel without executing any of the code you were pointing at. You can see that the parallel evaluation uses evaluate_states: https://github.com/florensacc/rllab-curriculum/blob/master/curriculum/envs/maze/maze_evaluate.py#L312

This in turn ends up calling env.update_start_generator(FixedStateGenerator(state)), which is valid: https://github.com/florensacc/rllab-curriculum/blob/master/curriculum/state/evaluator.py#L274

To answer your question: you can see in the first link I pasted here that the states that are evaluated come from the functions tile_space or find_empty_spaces. Is this what you were looking for? If you wish to test some other custom set of states, you can modify that part of the code.
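
For illustration, here is a minimal sketch of the pattern described above: evaluating a custom list of start states by pointing the env at a fixed start before each rollout. Everything in it is an assumption made for the example. `ToyPointEnv`, `go_to_goal_policy`, `evaluate_custom_starts` and the `sample()` method are hypothetical, and the `FixedStateGenerator` class is a toy stand-in, not the one in this repo. Only the call shape `env.update_start_generator(FixedStateGenerator(state))` is taken from the thread.

```python
import numpy as np


class FixedStateGenerator:
    """Toy stand-in (NOT the repo's class): always yields one start state."""

    def __init__(self, state):
        self.state = np.asarray(state, dtype=float)

    def sample(self):  # hypothetical method name
        return self.state


class ToyPointEnv:
    """Hypothetical 2D point env, used only to make the sketch runnable."""

    def __init__(self, goal=(0.0, 0.0), goal_radius=0.5):
        self.goal = np.asarray(goal, dtype=float)
        self.goal_radius = goal_radius
        self.start_generator = FixedStateGenerator(self.goal)
        self.pos = self.goal.copy()

    def update_start_generator(self, generator):
        # Same call shape as in the thread; the body is an assumption.
        self.start_generator = generator

    def reset(self):
        # The start position comes from whatever generator is installed.
        self.pos = self.start_generator.sample().copy()
        return self.pos

    def step(self, action):
        # Move by at most 0.1 per axis, then check the goal region.
        self.pos = self.pos + np.clip(action, -0.1, 0.1)
        reached = np.linalg.norm(self.pos - self.goal) < self.goal_radius
        return self.pos, bool(reached)


def go_to_goal_policy(obs, goal):
    """Hypothetical policy: head straight for the goal."""
    return goal - obs


def evaluate_custom_starts(env, starts, horizon=50):
    """Run one rollout per start state and record whether the goal is reached."""
    results = []
    for state in starts:
        env.update_start_generator(FixedStateGenerator(state))
        obs = env.reset()
        reached = False
        for _ in range(horizon):
            obs, reached = env.step(go_to_goal_policy(obs, env.goal))
            if reached:
                break
        results.append(reached)
    return results


# A custom list of start states instead of a tiled grid.
custom_starts = [(1.0, 1.0), (-2.0, 0.5), (40.0, 40.0)]
success = evaluate_custom_starts(ToyPointEnv(), custom_starts)
print(success)
```

In the real code, the list of states would presumably replace the output of tile_space or find_empty_spaces at the point where evaluate_states is called; that wiring is not shown here.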