JuliaPOMDP / POMDPs.jl

MDPs and POMDPs in Julia - An interface for defining, solving, and simulating fully and partially observable Markov decision processes on discrete and continuous spaces.
http://juliapomdp.github.io/POMDPs.jl/latest/
Other
668 stars 104 forks source link

Introduce BeliefUpdater type #32

Closed ebalaban closed 9 years ago

ebalaban commented 9 years ago

I think it would be very useful to define a BeliefUpdater type. Here are the main benefits, as I see them:

We'd need to modify 'belief' function as follows:

belief(bu::BeliefUpdater, pomdp::POMDP, belief_old::Belief, action::Any, obs::Any, belief_new::Belief=create_belief(pomdp))

mykelk commented 9 years ago

I'm good with this if there is consensus.

zsunberg commented 9 years ago

I'll reply later today

On Sat, Sep 26, 2015 at 3:58 PM Mykel Kochenderfer notifications@github.com wrote:

I'm good with this if there is consensus.

— Reply to this email directly or view it on GitHub https://github.com/sisl/POMDPs.jl/issues/32#issuecomment-143502952.

zsunberg commented 9 years ago

After thinking about this for a while, I am against introducing this type. First, I will talk some about how the functionality of a BeliefUpdater can be implemented (elegantly) in the current interface, and then I will talk about the reasons that I think it should not be included.

A) BeliefUpdater functionality can be implemented using the current framework

First, it should be noted that any belief update behavior that could be specified by a BeliefUpdater could be specified by including the BeliefUpdater object in the Belief object. Thus, there is not a strict need for BeliefUpdater for the sake of being able to represent all possible belief update behaviors.

If I understand correctly, the argument for introducing BeliefUpdater is that it would be more convenient in some cases to reuse code by defining several BeliefUpdaters for one Belief type, and more julian to use multiple dispatch to determine behavior rather than examining the contents of the Belief object. Fortunately, julia's parametric type system allows us to easily do this without using two objects. I believe the examples that @ebalaban used involved particle filters, but I think Gaussian filters for nonlinear systems might offer a shorter example. Suppose that we have a single Gaussian belief representation, but we want to have an Unscented Kalman Filter (UKF) update and an Extended Kalman Filter (EKF) update. This can be implemented in the following way:

abstract Belief

abstract GaussianBeliefUpdater

type EKFUpdater <: GaussianBeliefUpdater
end

type UKFUpdater <: GaussianBeliefUpdater
end

type GaussianBelief{T<:GaussianBeliefUpdater}
end

function belief(b::GaussianBelief{EKFUpdater})
    println("EKF update")
end

function belief(b::GaussianBelief{UKFUpdater})
    println("UKF update")
end

ekf_belief = GaussianBelief{EKFUpdater}()
ukf_belief = GaussianBelief{UKFUpdater}()

belief(ekf_belief)
belief(ukf_belief)

The belief function here represents the belief function in POMDPs.jl. Both GaussianBelief{EKFUpdater} and GaussianBelief{UKFUpdater} are subtypes of Belief, but they behave differently when belief is called. The first time belief is called, an ekf update occurs, the second time a ukf update occurs.

If there are any other cases that cannot be addressed by this method or something similar, please post a comment, and I think we can come up with a way to do it elegantly without requiring BeliefUpdater.

B) BeliefUpdater will make developing and using POMDPs.jl more difficult and less likely to be widely adopted

I think BeliefUpdater will make life more difficult for us primarily for the following 4 reasons:

1. There is one more concept to understand

When someone is deciding whether to use POMDPs.jl, they will look at the interface and, the quicker they understand it, the more likely they will be to adopt it. BeliefUpdater is likely to distract from the more fundamental concepts we are trying to communicate.

2. The list of objects that define a problem will be larger

Every time someone writes a new problem (POMDP subtype), they will have to define a BeliefUpdater. Moreover, there will be many places where types will need to have an extra belief updater field, or functions will need to have belief updater arguments. This is unnecessary for simple problems, and will raise the barrier to entry.

3. create_belief and specialize_belief will be much more complicated

If we want to automatically determine the belief type based on the pomdp as we do when we call create_belief, we will also need to define a function to determine the BeliefUpdater for that Belief. Furthermore, if we adopt the specialize_belief function in #32, the specialized belief will also have an associated BeliefUpdater that will need to be determined.

4. Belief and BeliefUpdater types will tend to be highly coupled

One of the main reasons that I see for introducing a new abstract type in POMDPs.jl is that it defines a new component that can be reused with a wide range of other components, i.e. most POMDPs should be able to be used with most Solvers, which should be able to be used with most Simulators. This will probably not be the case with BeliefUpdaters. It seems to me that each BeliefUpdater would only be able to be used with one, or at most a few Belief types. Since there is such a strong coupling here, I think a better way to organize this is with parametric types or another similar approach.

I hope this all makes sense, I really want this interface to be as good as possible :)

mykelk commented 9 years ago

Thanks, guys, for taking the time to really figure this out. Zach, would it be possible to discuss at 11:30am on Monday?

ebalaban commented 9 years ago

@zsunberg, thank you for the detailed response on this. I certainly share your desire to make this API as good as possible :-) I think where we diverge on this is that I consider belief update to be an operation in roughly the same category of importance as solving the POMDP in the first place and you, perhaps, see it being in more of a supporting capacity to the solver (please correct me if I am wrong).

We could, of course, as you suggest, encapsulate belief updater parameters in Belief structures, create wrappers for beliefs that make them look different (even though, deep inside, they are all the same) and dispatch the appropriate belief() methods based on that. So, why not do the same for solvers too then? We'd create different wrappers for our input POMDP (e.g. POMCP_POMDP, DESPOT_POMDP, MCVI_POMDP, etc). We'd then hide configuration parameters for the solver inside the POMDP structure and then call solve(). I have to confess that before really understanding the general philosophy that POMDPs.jl was aiming for, this is kind of what I implemented in the first cut of DESPOT. I do, however, now like that we keep the POMDP structure as clean and generic as possible. In the same manner, I would also like to keep Belief implementations as clean as possible and only containing the distribution itself. As you may have deduced from our previous discussions, my preference is to resort to type wrapping only in the most dire circumstances. I think type wrapping makes it harder to understand what is going on, rather than easier.

Ideally, belief updating would be independent of the problem type and of the solver type. Thus any problem can be used with any solver and with any belief updater that supports the belief representation type that we want to use in the solver. You are right that a belief updater will be tied to one (or, at most, few) belief types - but this is what I think we want! In the near term, belief updaters will probably be supplied by solver writers to work with their preferred belief representations. In the future, however, I'd like to see a library of them that can work on the popular belief representation types using a variety of methods (e.g., UKF and EKF for Gaussian belief, as you mentioned).

So, I remain convinced that endowing belief updaters with the same rights and responsibilities as solvers is the right way to go :-) Perhaps others will chime in and help us break our stalemate :-)

mykelk commented 9 years ago

I think I'm being persuaded by Edward. Very fundamental to POMDPs is the notion of belief updating. There are probably more algorithms for belief updating (e.g., in the fields of state estimation, tracking, and filtering) than there are for doing the actual planning. Of course, we want to stay focused on the planning aspect and not try to be a comprehensive state estimation package.

zsunberg commented 9 years ago

Just out of curiosity, will introducing a separate BeliefUpdater type help you solve the problem of DESPOT being tied to a specific belief representation/update method? I don't believe it can help with POMCP since the update simulations are carried out as part of the decision-making process. I am beginning to think the POMCP and DESPOT situations are not as similar as I thought.

zsunberg commented 9 years ago

I think the strongest argument for this is that separation of state and dynamics information is good. I.e. in the <POMDP, State> domain, the POMDP contains the dynamics information and the State contains the state information, in the <Policy, PolicyState> domain, the policy contains the dynamics information and the PolicyState contains the state informations, so, if we want beliefs to be analogous, as you have been saying all along, @ebalaban, we should have <BeliefUpdater, Belief> instead of just Belief.

I will feel a lot less uneasy about this if we adopt the changes suggested in #34 because then it would be clear that the Policy should carry a BeliefUpdater (or the Policy will handle the belief updates itself if that makes sense for the policy). Before #34 it wasn't clear to me where exactly the BeliefUpdater would live in the simulation and who would be responsible for it.

We may want to consider breaking the belief updates interface out into a separate package though to keep POMDPs more concise.

mykelk commented 9 years ago

I'm okay with belief update stuff either being in POMDPs.jl or in a separate package.

The Policy would keep track of the BeliefUpdater.

zsunberg commented 9 years ago

OK - what about this:

Either

abstract BeliefUpdater
abstract POMDP <: BeliefUpdater
abstract Policy <: BeliefUpdater

function update(::BeliefUpdater, ::Belief, ::Action, ::Observation)

or, if we can't stomach something that radical, simply

function update(::Union(BeliefUpdater, Policy, POMDP), ::Belief, ::Action, ::Observation)

(with the extra preallocation arguments of course).

This reflects the fact that a Policy may provide the belief update if the belief is a policy-specific structure, and a POMDP may provide exact belief updates or domain-specific approximate belief updates.

I think this solves all of our problems:

  1. @ebalaban can define BeliefUpdaters that are not Policys or POMDPs to his heart's content
  2. @zsunberg doesn't have to carry around an extra BeliefUpdater wherever beliefs are used
  3. This provides a way for problem designers to specify exact belief updates

I think that in the outer loop simulate function, the policy should always be the first argument to update for consistency's sake, but specifying an additional BeliefUpdater argument in simulate would also be reasonable. If the policy is always used, that policy may contain a BeliefUpdater for internal use if the solver-writer wants to provide that flexibility.

mykelk commented 9 years ago

Okay. I'd rather have four separate functions instead of one with a Union. I really want to avoid using Unions.

zsunberg commented 9 years ago

@mykelk, is there a reason you said "four" instead of "three"? I only see three being created. Am I missing something?

Also, is there a "Union considered harmful" somewhere out there for me to read? To me using a union seems like a more concise way of doing exactly the same thing as declaring the three functions (this is of miniscule importance - not worth arguing - I'm just curious).

zsunberg commented 9 years ago

@ebalaban should we close this and move further discussion to #35 ?

ebalaban commented 9 years ago

Sounds good.