h2r / pomdp-py

A framework to build and solve POMDP problems. Documentation: https://h2r.github.io/pomdp-py/
MIT License
216 stars 50 forks source link

Multi object search history and tree construction #41

Closed RajeshDM closed 5 months ago

RajeshDM commented 9 months ago

Hi. I was experimenting with the multi-object search and came across this function clear_history() (https://github.com/h2r/pomdp-py/blob/85d9a3182e6a883e44315617c9fdfde484d05c01/pomdp_problems/multi_object_search/agent/agent.py#L59) as part of the agent which is called here : https://github.com/h2r/pomdp-py/blob/85d9a3182e6a883e44315617c9fdfde484d05c01/pomdp_problems/multi_object_search/problem.py#L236

But when I run the code, it looks like the history is not being deleted but being expanded upon. I assume the expected behavior is once the real action is executed, the previous history is deleted and only the last action and observation are stored (in the very next line).

In the case of the above assumed behaviour - (that only 1 item will be stored in history), there might be a potential issue at :https://github.com/h2r/pomdp-py/blob/85d9a3182e6a883e44315617c9fdfde484d05c01/pomdp_problems/multi_object_search/models/policy_model.py#L31 as it is checking for history being larger than 1.

Separate issue - is it possible the tree construction in PO_UCT also needs to be reset every time a real action is taking? I'm seeing a weird behavior when the next suggested real action is an action that should not be possible and the only way it could potentially happen is if the tree is not being cleared fully. I have been unable to debug this well as it's in .pyx code and I'm still seeing how to debug that.

zkytony commented 9 months ago

But when I run the code, it looks like the history is not being deleted but being expanded upon.

I think you caught a bug. _history is a field in the Cython Agent class, and can't be overridden on the python side. So clear_history in MOS Agent simply won't do its job. The correct way is to add set_history function in the Cython Agent class, similar to set_belief. I guess I was never really bothered by a growing history.

is it possible the tree construction in PO_UCT also needs to be reset every time a real action is taking?

Tree update should happen upon planner.update. If you have called that function, then you shouldn't experience the issue you're describing. Regarding debugging, check out the tree debugger.

RajeshDM commented 9 months ago

But when I run the code, it looks like the history is not being deleted but being expanded upon.

I think you caught a bug. _history is a field in the Cython Agent class, and can't be overridden on the python side. So clear_history in MOS Agent simply won't do its job. The correct way is to add set_history function in the Cython Agent class, similar to set_belief. I guess I was never really bothered by a growing history.

is it possible the tree construction in PO_UCT also needs to be reset every time a real action is taking?

Tree update should happen upon planner.update. If you have called that function, then you shouldn't experience the issue you're describing. Regarding debugging, check out the tree debugger.

yep, that history needs to be reset in the Cython code. What do you think about the consequence of making this update to the behavior of get all actions function : [pomdp-py/pomdp_problems/multi_object_search/models/policy_model.py](https://github.com/h2r/pomdp-py/blob/85d9a3182e6a883e44315617c9fdfde484d05c01/pomdp_problems/multi_object_search/models/policy_model.py#L31 - in this case the length of history will never be greater than 1, so I believe that should be changed to len(history) > 0

I will check out the update and the debugger thank you.

zkytony commented 9 months ago

The history does get accumulated during planning. get_all_actions is called by self._agent.valid_actions by default. You can try len(history) > 0. I believe that means the agent can do Look, Find. Currently it has to do Move, Look, Find. I think it's fine either way.

Regarding fixing clear_history (I believe this is what you mean by "this update"), I don't think there is noticeable difference before/after the fix.