JuliaPOMDP / POMDPs.jl

MDPs and POMDPs in Julia - An interface for defining, solving, and simulating fully and partially observable Markov decision processes on discrete and continuous spaces.
http://juliapomdp.github.io/POMDPs.jl/latest/

Belief Initialization #18

Closed etotheipluspi closed 9 years ago

etotheipluspi commented 9 years ago

Currently, there are no rules on how a belief should be initialized. A belief is attached to a pomdp, so having a create function for it seems to make sense. There are two ways that I think might work here:

* Have something like create_belief(pomdp) that returns an initial belief instance.

* Have create_state_distribution(pomdp) return a distribution that can be used in transition!() and in update_belief!. This means that both the belief and the transition distribution are the same. This is reasonable because both are strictly distributions over states.

What does everyone think?

ebalaban commented 9 years ago

I was going to suggest the same thing. I have a function like that in DESPOT.jl and called it "initial_belief" for the time being.

ebalaban commented 9 years ago

Oh, my vote would be for the first option, so that those solvers that do not enumerate states could return belief in whatever form they represent it in (e.g. a set of particles).

etotheipluspi commented 9 years ago

Good point! Particle based beliefs might have a different implementation from the transition distribution. I will put my vote in for the first option as well.
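
To make the point about representations concrete, here is a minimal sketch in current Julia syntax. All type and function names (TinyPOMDP, DiscreteBelief, ParticleBelief, uniform_discrete, uniform_particles, weight) are hypothetical and are not part of POMDPs.jl; the sketch only shows that the same uniform initial belief can be held either as a probability vector or as a set of particles.

```julia
# Hypothetical types for illustration only; not the POMDPs.jl API.
abstract type Belief end

struct TinyPOMDP
    n_states::Int
end

# Explicit probability vector over enumerated states.
struct DiscreteBelief <: Belief
    probs::Vector{Float64}
end

# Unweighted set of sampled states, as a particle-based solver might keep.
struct ParticleBelief <: Belief
    particles::Vector{Int}
end

# Two possible initial beliefs for the same problem, both uniform.
uniform_discrete(p::TinyPOMDP) = DiscreteBelief(fill(1.0 / p.n_states, p.n_states))
uniform_particles(p::TinyPOMDP, n::Int) = ParticleBelief(rand(1:p.n_states, n))

# Probability mass each representation assigns to state s.
weight(b::DiscreteBelief, s::Int) = b.probs[s]
weight(b::ParticleBelief, s::Int) = count(x -> x == s, b.particles) / length(b.particles)

pomdp = TinyPOMDP(4)
bd = uniform_discrete(pomdp)
bp = uniform_particles(pomdp, 1000)
```

A transition distribution would naturally look like DiscreteBelief, while a particle solver would prefer ParticleBelief, which is why a single shared distribution type may be too restrictive.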

zsunberg commented 9 years ago

If it is implemented, I think it should be called something other than create_belief because functions that begin with "create" are used elsewhere to allocate empty states/observations/distributions whose contents do not matter. The proposed function is something different. initial_belief would be a good name.

mykelk commented 9 years ago

I think you want both create_belief and initialize_belief!(b) that would initialize an existing object. Right?
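
A rough sketch of that two-function pattern, assuming a hypothetical GridPOMDP problem and VectorBelief type (neither is from POMDPs.jl) and assuming the initial belief is uniform:

```julia
# Hypothetical names and types; a sketch of the allocate-then-fill pattern.
struct GridPOMDP
    n_states::Int
end

struct VectorBelief
    probs::Vector{Float64}
end

# Allocate storage only; the contents are not meaningful yet.
create_belief(p::GridPOMDP) = VectorBelief(zeros(p.n_states))

# Overwrite an existing belief in place with a uniform distribution.
function initialize_belief!(b::VectorBelief)
    fill!(b.probs, 1.0 / length(b.probs))
    return b
end

pomdp = GridPOMDP(5)
b = create_belief(pomdp)   # allocated once
initialize_belief!(b)      # can be called again later without reallocating
```

Keeping allocation and initialization separate matches the existing convention that "create" functions return objects whose contents do not matter.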

zsunberg commented 9 years ago

I think we need to answer the conceptual question "Is the initial belief part of the problem definition? (or should it be an optional part of the problem definition?)"

Here are some arguments (that may or may not be strong) for both sides:

Alternatively, we could add a keyword argument for solve that takes an initial belief, and if no initial belief is supplied, the solver may complain "please specify an initial belief" or something. My initial reaction is that this is a better solution, but I am by no means committed to that.
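
As a sketch of the keyword-argument alternative, assuming hypothetical ToyPOMDP, ToySolver, and ToyPolicy types and a belief stored as a plain probability vector (none of this is the actual POMDPs.jl signature):

```julia
# Hypothetical solver and problem types; not the POMDPs.jl API.
struct ToyPOMDP
    n_states::Int
end

struct ToySolver end

struct ToyPolicy
    b0::Vector{Float64}
end

# The initial belief is an optional keyword; this solver needs one, so it
# complains when none is supplied.
function solve(solver::ToySolver, pomdp::ToyPOMDP; initial_belief=nothing)
    if initial_belief === nothing
        error("please specify an initial belief")
    end
    length(initial_belief) == pomdp.n_states ||
        error("initial belief has the wrong length")
    return ToyPolicy(copy(initial_belief))
end

pomdp = ToyPOMDP(2)
policy = solve(ToySolver(), pomdp; initial_belief=[0.5, 0.5])
```

A solver that does not depend on an initial belief could simply ignore the keyword, so the problem definition itself stays free of b0.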

[1] Kurniawati, H., Hsu, D., & Lee, W. (2008). SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces. Robotics: Science and Systems. Retrieved from https://www1.comp.nus.edu.sg/~leews/publications/rss08.pdf

[2] Bai, H., Hsu, D., Kochenderfer, M. J., & Lee, W. S. (2011). Unmanned aircraft collision avoidance using continuous-state POMDPs. Proceedings of Robotics: Science and Systems VII. Retrieved from http://books.google.com/books?hl=en&lr=&id=Ziy81kH3KfUC&oi=fnd&pg=PA1&dq=Unmanned+Aircraft+Collision+Avoidance+using+Continuous-State+POMDPs&ots=ZYixuZ4h3Z&sig=DfRb6ZDLeZtiMFBrF62yH-W_DuU

mykelk commented 9 years ago

I'm happy with it being an optional argument to solve or as part of the problem definition.

ebalaban commented 9 years ago

In the purest, most general form a POMDP definition indeed does not need a b0. In practical terms, however, given a problem and an online solver, where would b0 come from if it is not part of the problem definition?

zsunberg commented 9 years ago

If the solver is an online solver, the work of solving is done inside of action() rather than solve() (if there are some preliminary offline calculations, they can be done in solve()). action() takes the current belief as an argument - and that is the belief for which it should find an optimal action.
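
A minimal sketch of that division of labor, assuming hypothetical TwoStatePOMDP, OnlineSolver, and OnlinePolicy types and using a one-step expected-reward lookup as a stand-in for real online planning:

```julia
# Hypothetical types; a sketch of where an online solver does its work.
struct TwoStatePOMDP
    rewards::Matrix{Float64}   # rewards[s, a]
end

struct OnlineSolver end

struct OnlinePolicy
    pomdp::TwoStatePOMDP
end

# solve() only stores what the planner needs later; no belief is required here.
solve(::OnlineSolver, pomdp::TwoStatePOMDP) = OnlinePolicy(pomdp)

# action() receives the current belief and plans for it. The "planning" here
# is a one-step lookahead: pick the action with the best expected reward.
function action(policy::OnlinePolicy, belief::Vector{Float64})
    expected = policy.pomdp.rewards' * belief   # expected reward per action
    return argmax(expected)
end

pomdp = TwoStatePOMDP([1.0 0.0; 0.0 2.0])
policy = solve(OnlineSolver(), pomdp)
a1 = action(policy, [0.9, 0.1])   # belief concentrated on state 1
a2 = action(policy, [0.1, 0.9])   # belief concentrated on state 2
```

Because the belief only enters through action(), nothing in solve() has to know where the first belief came from.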

mykelk commented 9 years ago

Yep, Zach is right.

ebalaban commented 9 years ago

Yes, that's fine, but how does a user create the "current belief"? Let's say the user wants to try running the RockSample(4,4) problem with POMCP. Belief representation will depend on the solver; belief contents will depend on the particular problem. What would the user get started with if we don't have a create_belief or an initial_belief function? Will they need to concoct an array of, say, particles, and fill in states and weights/probabilities?

mykelk commented 9 years ago

I think some solvers are fundamentally tied to a particle-based representation (e.g., POMCP, DESPOT). These solvers would just keep track of their own belief representation internally (basically just an array of states). The root node can be sampled from the user-specified belief distribution. Does that make sense?
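
One way the root-node sampling could look, as a sketch only: it assumes the user-specified belief is a probability vector over integer state indices, and sample_particles is a hypothetical helper, not a function from POMCP or DESPOT.

```julia
using Random

# Draw n state indices from a discrete belief given as a probability vector.
function sample_particles(rng::AbstractRNG, probs::Vector{Float64}, n::Int)
    cdf = cumsum(probs)
    particles = Vector{Int}(undef, n)
    for i in 1:n
        u = rand(rng) * cdf[end]
        # First state whose cumulative mass strictly exceeds u; the min()
        # guards against floating-point round-off at the upper end.
        particles[i] = min(searchsortedlast(cdf, u) + 1, length(probs))
    end
    return particles
end

rng = MersenneTwister(1)
b0 = [0.0, 0.25, 0.75]                 # user-specified belief over 3 states
root = sample_particles(rng, b0, 500)  # the solver's internal particle set
```

The solver then owns `root` and never needs the user's distribution type again.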

zsunberg commented 9 years ago

For POMCP, I have defined a belief called POMCPBeliefWrapper that actually holds the tree used by the solver. This Wrapper can be used with any belief the user supplies. My implementation of POMCP can either use the particle filter as described in the POMCP paper, or, if analytical belief updates are possible, it can use those (in this case, the algorithm is not strictly POMCP, but is a simpler tree search).

I couldn't figure out a way to handle this strictly inside the solver - I think you have to use some kind of belief structure.

See https://github.com/sisl/POMCP.jl/blob/master/src/POMCP.jl and https://github.com/sisl/POMCP.jl/blob/master/test/crying_baby.jl (this implementation isn't perfect yet and may change)

ebalaban commented 9 years ago

Again, that's fine, but wouldn't the user then need to have some idea of how the states are represented in the problem (in order to create that belief distribution)? For instance, in RockSample the integer-valued states could be created by combining information about the position of the rover and the status of the rocks in a specific way.

By the way, in this discussion for the purposes of 'user' I am imagining somebody who downloads our packages and wants to quickly try different solvers with different problems.
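
To illustrate how much problem-specific knowledge such a state encoding carries, here is one possible scheme. It is assumed purely for illustration and is not taken from DESPOT.jl or any RockSample implementation; encode_state and decode_state are hypothetical names.

```julia
# One possible encoding, assumed for illustration; not taken from any package.
# The rover cell goes in the high bits and one bit per rock in the low bits.
function encode_state(cell::Int, rocks::Vector{Bool})
    bits = 0
    for (i, good) in enumerate(rocks)
        if good
            bits |= 1 << (i - 1)
        end
    end
    return (cell << length(rocks)) | bits
end

# Invert the encoding: recover the rover cell and the per-rock status flags.
function decode_state(s::Int, n_rocks::Int)
    cell = s >> n_rocks
    rocks = [((s >> (i - 1)) & 1) == 1 for i in 1:n_rocks]
    return cell, rocks
end

s = encode_state(5, [true, false, true, true])
cell, rocks = decode_state(s, 4)
```

A user who has to build a belief by hand would need to know this layout, which is the difficulty being described.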

mykelk commented 9 years ago

Yeah, I think "user" is typically used here to mean someone who is implementing a problem and wants it solved. The user will absolutely need to specify how to represent the state.

ebalaban commented 9 years ago

Ok, so even if we had somewhat different types of user in mind, we kind of circled back to specifying initial belief as part of the problem code, in one form or another...

zsunberg commented 9 years ago

Right. I think the functionality that Edward is describing would be very useful, but it is of secondary importance in POMDPs.jl. It is not necessary for defining a problem or writing a solver. Perhaps there should be some kind of benchmarking interface in POMDPToolbox...

ebalaban commented 9 years ago

Let's see if we are on the same page here... I am assuming POMDPs.jl to be the basic interface specification that a problem creator should implement to make it possible to run the problem in some default fashion without any extra work. So, if a POMDP novice wants to get started with our packages, he/she should be able, just by using API calls, to do a full (even if very basic) init/solve/execute cycle. The default implementation of the problem can specify a uniform initial belief distribution, for example, or something else. If our POMDP novice wants to change that, he/she would need to understand the problem more deeply and write a different initial_belief function. That would, in essence, create a new problem. Not having an initial belief function in the API kind of breaks this paradigm.
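
A very small sketch of such a cycle, under stated assumptions: BabyPOMDP, update_belief, and the threshold policy are hypothetical, the hidden state is treated as static, and the problem author supplies a uniform initial_belief as the default.

```julia
# Hypothetical two-state problem; every name here is illustrative.
struct BabyPOMDP
    p_cry_hungry::Float64   # P(cry | hungry)
    p_cry_full::Float64     # P(cry | not hungry)
end

# Default initial belief supplied by the problem author: uniform over 2 states.
initial_belief(::BabyPOMDP) = [0.5, 0.5]   # [P(hungry), P(full)]

# Bayes update for a static state after observing crying (true) or quiet (false).
function update_belief(p::BabyPOMDP, b::Vector{Float64}, cried::Bool)
    like = cried ? [p.p_cry_hungry, p.p_cry_full] : [1 - p.p_cry_hungry, 1 - p.p_cry_full]
    post = like .* b
    return post ./ sum(post)
end

# Trivial policy: feed (action 1) when hunger is more likely, else wait (2).
action(b::Vector{Float64}) = b[1] > 0.5 ? 1 : 2

pomdp = BabyPOMDP(0.8, 0.1)
observations = (true, true, false)
# init with the default belief, then fold in each observation in turn
b = foldl((bel, cried) -> update_belief(pomdp, bel, cried), observations;
          init=initial_belief(pomdp))
a = action(b)
```

The novice never constructs a belief by hand here; changing the starting point means overriding initial_belief for the problem type.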

You (Zach and Mykel) are fine with that and want to leave the creation of an initial belief for a specific problem entirely up to our hypothetical novice POMDP user. I, on the other hand, would like to see problem creators providing at least some sort of an initial belief structure that can then be either tweaked directly or regenerated through a new user-supplied initial_belief function for that particular problem type.

Did I capture our opinions accurately? :-)

mykelk commented 9 years ago

I'm not sure I understand your second paragraph, but I'm okay with having initial_belief in the API if that is the consensus (since it is used by many offline solvers). If they don't implement initial_belief and the solver they are using doesn't use it, then that's just fine.

zsunberg commented 9 years ago

I think you captured my opinion correctly in the second paragraph. Up to this point, I didn't consider the ability to run a very basic example just using API calls to be a requirement for the API. However, I do think it is very, very wise to make it exceedingly easy to run a first example. So maybe I should reconsider the requirements.

I think that the original subject of this issue and what we are talking about now are two different subjects. Originally, we were talking about the initial belief for which a solver produces a solution. Now we are talking about a function that conveniently creates a default belief to run a test simulation with.

I think the original problem should be solved with an initial_belief keyword argument for solve(), and the second problem may be solved by including a function called default_belief() or default_initial_belief(). What does everyone think? We might want to open a new issue for the default belief function.

mykelk commented 9 years ago

Sounds good.

ebalaban commented 9 years ago

I think we've pretty much converged :-) The only thing I'd add is that for problems that, for one reason or another, are best solved with online solvers, b0 to me is pretty much part of the problem definition. So in those cases it goes somewhat beyond being just a test example.

That said, I am fine with calling that function 'default_initial_belief' ('default_belief' has a slightly different connotation, as in "I can't figure out what's going on right now, so this is my default belief"). As for the argument to 'solve', I'd probably just call it 'belief', like we do for 'action' (to keep them somewhat generic).

mykelk commented 9 years ago

Cool. In the interest of shorter names without introducing ambiguity, is there a reason to not just call it initial_belief?

ebalaban commented 9 years ago

I am fine with that. From the argument (i.e. initial_belief(pomdp::POMDP)) I think it should be clear that what it returns is the default initial belief for that problem. And I think we've agreed that it should return an AbstractDistribution type object that can then be morphed into something concrete and solver-specific by a solver, right?

mykelk commented 9 years ago

I think you want create_belief and initialize_belief!(b) as suggested earlier. Right? Or we can do the trick where we do initial_belief(pomdp::POMDP; belief = create_belief(pomdp)), which returns belief?
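
The keyword trick could be sketched as follows. LinePOMDP and ArrayBelief are hypothetical types, and the uniform fill is an assumed default, not something the thread specifies.

```julia
# Hypothetical types; a sketch of the keyword-default trick.
struct LinePOMDP
    n_states::Int
end

struct ArrayBelief
    probs::Vector{Float64}
end

# Allocate a belief of the right size; its contents are not meaningful yet.
create_belief(p::LinePOMDP) = ArrayBelief(zeros(p.n_states))

# Fill `belief` with the problem's initial distribution and return it.
# With no keyword, a fresh belief is allocated; otherwise storage is reused.
function initial_belief(p::LinePOMDP; belief::ArrayBelief=create_belief(p))
    fill!(belief.probs, 1.0 / p.n_states)
    return belief
end

pomdp = LinePOMDP(4)
b1 = initial_belief(pomdp)                   # allocates a new belief
scratch = create_belief(pomdp)
b2 = initial_belief(pomdp; belief=scratch)   # reuses preallocated storage
```

One function then covers both the convenient call and the allocation-free call, which is what would make a separate initialize_belief! unnecessary.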

ebalaban commented 9 years ago

Oh, I see. Sure, if we want to avoid memory allocation within 'initial_belief', we can do either of the two options you describe. I kind of like the second one. And yes, I meant Belief type rather than AbstractDistribution, sorry.

mykelk commented 9 years ago

Yeah, I think I like the second one too---especially if we decide to get rid of the ! functions as considered in #23.

zsunberg commented 9 years ago

Mykel, regarding your suggestion to shorten the name: by calling it initial_belief, we are implying that the initial belief is part of the problem definition. Arguments for and against this are in my earlier post. If we want to make the initial belief part of the problem definition, then calling it initial_belief is fine. If not, then we should call it something else.

mykelk commented 9 years ago

@zsunberg, I'm okay with it being part of the problem definition if that is okay with the rest of the group. I realize there are arguments for and against this, but I'm not swayed strongly either way.

ebalaban commented 9 years ago

It appears that there are no other objections to this. I'll go ahead and put this in belief.jl and close the issue. If anyone has additional concerns though, please feel free to reopen it.

mykelk commented 9 years ago

Good. Go ahead and close.