facebookresearch / habitat-lab

A modular high-level library to train embodied AI agents across a variety of tasks and environments.
https://aihabitat.org/

NavRLEnv: question about 'success' and 'get_action_shortest_path' #726

Closed · sparisi closed this issue 2 years ago

sparisi commented 3 years ago

Habitat-Lab and Habitat-Sim versions

Habitat-Lab: master
Habitat-Sim: master

I have a navigation task, and in the YAML I have SUCCESS among the MEASUREMENTS, with SUCCESS_REWARD set to 2.5 and SUCCESS_DISTANCE to 0.2. I also set the goal radius to 0.2 in the JSON file. I run get_action_shortest_path with success_distance=env.unwrapped._core_env_config.TASK.SUCCESS_DISTANCE (0.2) to find the optimal trajectory (the function finds a 51-step trajectory), and then I execute the returned actions to see what the optimal rewards would be.

Image rendering looks fine, and the distance between the agent and the goal decreases. However, at the end of the trajectory info['success'] is still 0, the reward is small (it peaks mid-trajectory at about 0.2x), and the agent is still far from the goal (0.36, which is larger than 0.2). Also, the done flag is False (I expect it to be True when the agent reaches the goal). Since the path is found with get_action_shortest_path, shouldn't the agent get closer than 0.2 to the goal, receive a reward close to 2.5, and have info['success'] equal to 1 at the last step of the path?

The scene is apartment_0 from Replica.
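For reference, this is roughly the script I am running (a minimal sketch: the config path is a placeholder for my own task config, I assume a single-episode dataset so reset() returns to the same episode, and the metric keys assume DISTANCE_TO_GOAL and SUCCESS are in MEASUREMENTS):

```python
import habitat
from habitat.datasets.utils import get_action_shortest_path

config = habitat.get_config("configs/tasks/pointnav.yaml")  # placeholder for my config
env = habitat.Env(config=config)
env.reset()
episode = env.current_episode

# Plan the action-level shortest path with the same success distance as the task.
shortest_path = get_action_shortest_path(
    env.sim,
    source_position=episode.start_position,
    source_rotation=episode.start_rotation,
    goal_position=episode.goals[0].position,
    success_distance=config.TASK.SUCCESS_DISTANCE,  # 0.2
)
print(f"planned {len(shortest_path)} steps")  # 51 in my case

# get_action_shortest_path steps the simulator internally, so reset before
# replaying (with a single-episode dataset this returns to the same episode).
env.reset()
for point in shortest_path:
    env.step(point.action)
    metrics = env.get_metrics()
    print(metrics["distance_to_goal"], metrics["success"])
```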

Example of a trajectory:

EDIT

The second component of the position (the y, i.e. height, coordinate) seems wrong, so I changed it to the value I get from sample_navigable_point, which is -1.3747652. Now the agent gets closer than 0.2 to the goal, but success is still 0 and done is False.
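For reference, the height can be read off a sampled navigable point like this (a tiny sketch; env is the habitat.Env from above, and the x/z goal values are hypothetical placeholders):

```python
# Sample a navigable point to read a valid y (height) coordinate for this floor.
point = env.sim.sample_navigable_point()
print(point[1])  # -1.3747652 in my apartment_0 runs

# Reuse that height as the second component of the goal position.
goal_position = [1.0, point[1], -2.0]  # x/z here are hypothetical placeholders
```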

sparisi commented 3 years ago

The reason 'success' is always 0 and the environment never returns SUCCESS_REWARD at the goal is that both require the 'STOP' action to be issued, and get_action_shortest_path does not return 'STOP'. For my purposes I do not use 'STOP' actions, and I'd like the episode to end with 'success' when the agent is close to the goal. Is there a flag I can pass (maybe in the YAML?) to achieve that? Or a way to override Success, or to create and pass a custom terminal condition? Usually in RL the episode ends automatically when the agent reaches the goal.
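Concretely, the gating is in the Success measure; paraphrasing habitat-lab's nav task code (the exact code varies across versions, but the stop gating is the point):

```python
# Paraphrased from habitat.tasks.nav.nav.Success; not a verbatim copy.
from habitat.tasks.nav.nav import DistanceToGoal

class Success:  # excerpt, not the full measure
    def update_metric(self, episode, task, *args, **kwargs):
        distance_to_target = task.measurements.measures[
            DistanceToGoal.cls_uuid
        ].get_metric()
        # Proximity alone is not enough: the agent must also call STOP.
        if (
            getattr(task, "is_stop_called", False)
            and distance_to_target < self._config.SUCCESS_DISTANCE
        ):
            self._metric = 1.0
        else:
            self._metric = 0.0
```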

erikwijmans commented 3 years ago

No, we currently don't support removing the stop action, given its importance in Embodied AI tasks (a task like ObjectNav would be much less meaningful without a stop action, for instance).

Skylion007 commented 3 years ago

You could, however, have an oracle that calls STOP when a condition is reached. This usually requires hard-coding it into the agent with privileged information.
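A minimal sketch of what that oracle could look like, assuming DISTANCE_TO_GOAL is among the task's MEASUREMENTS (the function and the config access path are illustrative, not an official habitat-lab API):

```python
from habitat.sims.habitat_simulator.actions import HabitatSimActions

def oracle_action(env, policy_action):
    """Replace the policy's action with STOP once the goal is within reach.

    Privileged information: reads the true geodesic distance to the goal
    from the DistanceToGoal measure.
    """
    distance = env.get_metrics()["distance_to_goal"]
    if distance < env._config.TASK.SUCCESS_DISTANCE:  # illustrative config access
        return HabitatSimActions.STOP
    return policy_action
```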

sparisi commented 3 years ago

> You could, however, have an oracle that calls STOP when a condition is reached. This usually requires hard-coding it into the agent with privileged information.

I think it would be easier to check for the terminal condition without 'STOP' and then call reset. Anyway, yes, I would have to hard-code the SPL success condition into my wrapper. For now I have simply commented out the check for 'STOP' in the source code, and it works.
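Roughly what I have in mind, as a gym-style wrapper (a sketch, not habitat-lab API; it assumes DISTANCE_TO_GOAL is in MEASUREMENTS so that info carries the distance, and the numbers mirror my YAML):

```python
class ProximityTermination:
    """End the episode on proximity alone, with no STOP from the agent."""

    def __init__(self, env, success_distance=0.2, success_reward=2.5):
        self._env = env
        self._success_distance = success_distance
        self._success_reward = success_reward

    def reset(self):
        return self._env.reset()

    def step(self, action):
        obs, reward, done, info = self._env.step(action)
        # Hard-coded success condition, mirroring what Success/SPL would check.
        if info["distance_to_goal"] < self._success_distance:
            info["success"] = 1.0
            reward += self._success_reward
            done = True
        return obs, reward, done, info
```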

> No, we currently don't support removing the stop action, given its importance in Embodied AI tasks (a task like ObjectNav would be much less meaningful without a stop action, for instance).

It would still be useful to have a flag that removes the need for is_stop_called in Success (something like is_stop_required). Just a suggestion :)
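For example, in the task YAML the flag could look like this (purely hypothetical; this option does not exist in habitat-lab):

```yaml
TASK:
  SUCCESS:
    TYPE: Success
    SUCCESS_DISTANCE: 0.2
    IS_STOP_REQUIRED: False  # proposed: succeed on proximity alone
```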

rpartsey commented 2 years ago

@sparisi Feel free to re-open the issue if you still have questions.