h2r / pomdp-py

A framework to build and solve POMDP problems. Documentation: https://h2r.github.io/pomdp-py/
MIT License

Pomcp doesn't work in MOS #71

Open khuechuong opened 5 months ago

khuechuong commented 5 months ago

I wanted to run multi_object_search with POMCP, so I changed the sensor from proximity to laser:

problem = MosOOPOMDP(
    robot_char,  # r is the robot character
    sigma=0.05,  # observation model parameter
    epsilon=0.95,  # observation model parameter
    grid_map=grid_map,
    sensors={robot_char: laserstr},
    prior="uniform",
    agent_has_map=True,
)

and set belief_rep to "particles":

def __init__(
    self,
    robot_id,
    env=None,
    grid_map=None,
    sensors=None,
    sigma=0.01,
    epsilon=1,
    belief_rep="particles",
    prior={},
    num_particles=100,
    agent_has_map=False,
):

At first it gave me this error:

File "/home/pomdp-py/pomdp_py/problems/multi_object_search/agent/belief.py", line 74, in initialize_belief
    return _initialize_particles_belief(
TypeError: _initialize_particles_belief() missing 1 required positional argument: 'robot_orientations'

Since robot_orientations isn't used in the function, I just set it to None. Then it gave me this:

  File "/home/pomdp-py/pomdp_py/problems/multi_object_search/agent/belief.py", line 139, in _initialize_particles_belief
    init_robot_pose = list(prior[robot_id].keys())[0]
AttributeError: 'int' object has no attribute 'keys'

Now I'm not exactly sure what to do. From what I saw, in _initialize_histogram_belief the value of prior is {-114: {(7, 9, 0): 1.0}}, while in _initialize_particles_belief the value of prior is {-114: 0}.

zkytony commented 5 months ago

Hmmm, this may be a bug. I am not exactly sure what's happening here (it's been a while ...)

khuechuong commented 5 months ago

I've been trying to debug it, but I'm still new to pomdp_py and the MOS concept. I notice that in multi_object_search/agent/belief.py, the particles branch doesn't pass prior to _initialize_particles_belief:

if representation == "histogram":
    return _initialize_histogram_belief(
        dim, robot_id, object_ids, prior, robot_orientations
    )
elif representation == "particles":
    return _initialize_particles_belief(
        dim, robot_id, object_ids, robot_orientations, num_particles=num_particles
    )

So I'm not sure where the prior values are coming from, but they shouldn't just be 0. I tried passing prior in, but that just leads to another problem.
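For reference, here is a minimal standalone sketch of what drawing particles from a nested prior of the form {-114: {(7, 9, 0): 1.0}} could look like. It does not use pomdp_py; the function name sample_particles, its parameters, and the uniform fallback are assumptions made for illustration, not the library's actual implementation.

```python
import random


def sample_particles(prior, obj_id, candidate_poses, num_particles=100, seed=0):
    """Draw particles for one object from a nested prior {obj_id: {pose: prob}}.

    Hypothetical helper (not pomdp_py API). If the object has no usable
    prior entry, fall back to a uniform draw over candidate_poses.
    """
    rng = random.Random(seed)
    pose_probs = prior.get(obj_id)
    if not pose_probs:
        # No prior for this object: sample poses uniformly.
        return [rng.choice(candidate_poses) for _ in range(num_particles)]
    poses = list(pose_probs.keys())
    weights = list(pose_probs.values())
    # Sample poses in proportion to their prior probability.
    return rng.choices(poses, weights=weights, k=num_particles)


# Prior in the shape the histogram branch receives (robot id -114).
prior = {-114: {(7, 9, 0): 1.0}}
robot_particles = sample_particles(prior, -114, [], num_particles=5)

# An object without a prior entry gets uniform particles over a small grid.
grid = [(x, y) for x in range(3) for y in range(3)]
object_particles = sample_particles(prior, 0, grid, num_particles=5)
```

With a prior that puts all mass on one robot pose, every robot particle is that pose, which is the behavior the failing line (list(prior[robot_id].keys())[0]) appears to expect.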

zkytony commented 5 months ago

Oh, nice catch. I do notice the way _initialize_particles_belief is used is different from how it's declared. Code ref: https://github.com/h2r/pomdp-py/blob/main/pomdp_py/problems/multi_object_search/agent/belief.py#L122

I think I haven't run MOS with particle belief in a long while, and almost always used histogram + POUCT. That's why this was not fixed. Should not be difficult to fix though - feel free to give it a try!

Also, why use POMCP / particles? It's not going to work well because of particle depletion.

khuechuong commented 5 months ago

Well because in the MOS 2019 paper it said to use POMCP. Is POUCT better in this situation? Also, could you elaborate on particle depletion?

zkytony commented 5 months ago

You should read the paper more closely. OO-POMCP actually uses a histogram belief. Regarding particle deprivation, check out particle filters.

In short, if you’d like to try the algorithm from that paper, the closest approximation in this library is the POUCT + histogram planner in MOS.
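For readers unfamiliar with the term, particle deprivation can be illustrated with a toy resampling loop. This is a standalone sketch, not pomdp_py code; the 1000-cell world, the distance-based likelihood, and all names are assumptions chosen for illustration.

```python
import random

rng = random.Random(0)

# Hypothetical 1D world with 1000 cells and only 100 particles.
cells = list(range(1000))
initial_particles = [rng.choice(cells) for _ in range(100)]
true_cell = 123  # assumed true object location

particles = list(initial_particles)
unique_counts = [len(set(particles))]

for _ in range(5):
    # Toy likelihood: particles closer to the true cell get more weight.
    weights = [1.0 / (1 + abs(p - true_cell)) for p in particles]
    # Resampling with replacement can only copy existing hypotheses.
    particles = rng.choices(particles, weights=weights, k=len(particles))
    unique_counts.append(len(set(particles)))

# Hypotheses that survive are always a subset of the initial ones.
survivors = set(particles)
```

Because resampling only duplicates particles that already exist, the number of distinct hypotheses never grows, and a location that was not sampled at the start can never be recovered. With many grid cells per object and few particles, that is easy to run into.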

khuechuong commented 5 months ago

When I ran it with POUCT + histogram, the reward was 3923, so it only found 4 targets even though 5 targets were set. That's why I thought POMCP would do better. POMCP is POUCT + a particle belief representation, and I only used particles because that's how it said POMCP can be run.
Another thing I noticed: in multi_object_search/env/env.py, when I set occlusion to True in make_laser_sensor(90, (1, 4), 0.5, True), it gave me an error.

zkytony commented 5 months ago

It's not perfect :) do you expect robots to be able to find objects 100% of the time?

Yes, POMCP expects particle filters. POUCT can work with any belief representation.

Occlusion is implemented here in the Laser2DSensor in components/sensor.py. I believe it was working at some point. You could try installing an older version of pomdp-py, like pip install pomdp-py==1.2.1. Note that occlusion implemented here is not ideal. See note here: https://github.com/h2r/pomdp-py/blob/main/pomdp_py/problems/multi_object_search/models/components/sensor.py.

Besides, check out this repo which implements a different style of occlusion by walls (not perfect still, but is more appropriate than the grid-based method here):

[image]

zkytony commented 5 months ago

Just a reminder that you might have to dig through some debugging to get it working... I haven't run it in years. I don't have the time now to test / fix things. Feel free to fork and build on these repos.

You should post the error message. I'll see if I can help. In any case this is good information for others.

khuechuong commented 5 months ago

Thank you for the detailed response. I have a few more questions:

zkytony commented 5 months ago

Those are good questions. Thanks @khuechuong.

I'm not representing the authors, but I personally think the naming of OO-POMCP is unfortunate. It should be "OO-POUCT", since the paper used histogram beliefs. I had the same questions years ago when I started.

In short: When you run MOS in pomdp_py with a histogram belief (for the object beliefs in the agent's OOBelief), you are running "OO-POMCP" -- so the code is here. It is not the original code behind that paper (which was in Java), but it should capture the same ideas. Also, the original paper used a room-based representation which isn't in pomdp-py's MOS domain, but that's not the main point.

khuechuong commented 5 months ago

Okie. OO-POUCT ;) Also, for the state transition probabilities, I notice that all of them are fixed numbers. Is that for simulation only, or also in real life? And is this one version of the ICRA 2019 code: https://github.com/awandzel/multiobject_search_oopomdp

zkytony commented 5 months ago

The transition is deterministic, which is OK for modeling a rather high-level decision-making layer for this problem. As for the repo: I believe so.
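As a rough illustration of what "deterministic" means here, the sketch below models a grid robot whose next pose is fully determined by its current pose and action. It is standalone and hypothetical; the action names, the MOVES table, and the stay-in-place rule for blocked moves are assumptions, not the MOS domain's actual code.

```python
# Hypothetical action set: each action maps to a fixed (dx, dy) offset.
MOVES = {"north": (0, -1), "south": (0, 1), "east": (1, 0), "west": (-1, 0)}


def next_pose(pose, action, width, height, obstacles=frozenset()):
    """Return the single pose that follows from taking action at pose."""
    x, y = pose
    dx, dy = MOVES[action]
    nx, ny = x + dx, y + dy
    out_of_bounds = not (0 <= nx < width and 0 <= ny < height)
    if out_of_bounds or (nx, ny) in obstacles:
        return pose  # blocked moves leave the robot where it is
    return (nx, ny)


def transition_probability(new_pose, pose, action, width, height, obstacles=frozenset()):
    """Deterministic model: probability 1 for the one reachable pose, else 0."""
    expected = next_pose(pose, action, width, height, obstacles)
    return 1.0 if new_pose == expected else 0.0
```

For example, on a 5x5 grid, transition_probability((2, 1), (1, 1), "east", 5, 5) is 1.0, and every other candidate pose gets 0.0.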

khuechuong commented 5 months ago

Regarding POUCT: it seems that to use it, I have to give it the number of objects. Is there a way to formulate the problem without giving the number of targets? One way I could think of is to raise the number of targets so the agent finds as many as it can, but I feel that's a band-aid solution.

zkytony commented 5 months ago

POUCT is independent of the domain. I am not sure what you are talking about. Where do you specify the number of objects?

And yes, of course that’s possible. But you would need to change the state representation to be, e.g., coverage; this requires your own implementation. Or you could set the number of targets to 1, or some other fixed number, and keep re-creating and running the POMDP agent until time's up.

khuechuong commented 5 months ago

So basically in MOS, the belief always has size (number of targets + robot). If there are 5 targets, the size of agent.cur_belief.object_beliefs is 6, so the agent knows how many objects there are. I was just wondering about an elegant way to do object search without knowing how many objects are in the environment.

Also, regarding the histogram belief: the size of each target belief is the number of cells in the environment. How would that work in the real world, given limited space?

zkytony commented 5 months ago

Mmm, check out the GM-PHD filter for multi-target tracking, which can deal with a time-varying number of targets. Also check out this work on searching for and tracking an unknown number of targets: https://yoonchangsung.com/pub/gm-phd-tase-2021.pdf. But to me, practically, being able to apply a simple POMDP (single or fixed number of targets) in the unknown case seems elegant, as we don't need to come up with something more complicated.

Regarding the second question, check out the gif above. You could come up with a belief scheme over known locations and frontiers. This is also a worthwhile research direction.

khuechuong commented 4 months ago

I noticed that you implemented it in ROS also. I see that you keep the histogram belief over all costmap cells by decomposing the workspace into a 20x20 grid map. Would it be a good idea, whenever you detect a target, to update the histogram belief only in the area within a certain range around the detected target, rather than over the whole grid map?

zkytony commented 4 months ago

Yes that's a valid idea. 3D-MOS updates belief only within the field of view but ensures normalization.
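One way to update only the cells in view while keeping the belief normalized is to store unnormalized values together with a running normalizer. The sketch below is standalone and hypothetical: it is not the 3D-MOS code, and the class name LazyHistogram, the detect_prob parameter, and the "nothing detected" observation model are assumptions for illustration.

```python
class LazyHistogram:
    """Histogram belief that touches only field-of-view cells on update."""

    def __init__(self, cells):
        # Unnormalized values; a uniform prior gives every cell weight 1.
        self.values = {cell: 1.0 for cell in cells}
        self.normalizer = float(len(self.values))

    def prob(self, cell):
        """Normalized probability, computed on demand."""
        return self.values[cell] / self.normalizer

    def update_no_detection(self, fov_cells, detect_prob=0.9):
        """Nothing was detected in the field of view.

        Cells in view are scaled by the miss probability (1 - detect_prob);
        cells out of view are left untouched. The normalizer is adjusted by
        the change in each touched cell, so the cost is O(len(fov_cells)).
        """
        for cell in fov_cells:
            old = self.values[cell]
            new = old * (1.0 - detect_prob)
            self.values[cell] = new
            self.normalizer += new - old


# Usage: 4x4 grid, robot looks at the top-left 2x2 block and sees nothing.
grid = [(x, y) for x in range(4) for y in range(4)]
belief = LazyHistogram(grid)
fov = [(0, 0), (1, 0), (0, 1), (1, 1)]
belief.update_no_detection(fov)
total = sum(belief.prob(cell) for cell in grid)
```

After the update, probability mass shifts away from the observed cells toward the unobserved ones, and the probabilities still sum to 1 without ever looping over the full grid during the update.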