pyrddlgym-project / pyRDDLGym

A toolkit for auto-generation of OpenAI Gym environments from RDDL description files.
https://pyrddlgym.readthedocs.io/

[Question] Forward simulation #259

Open MFaisalZaki opened 1 week ago

MFaisalZaki commented 1 week ago

I understand that the library is aimed at RL-based planners. Are there any plans to support forward simulation so that we can implement A*-based and MCTS planners? A project called plangym provides this functionality, but it does not support RDDL.

ssanner commented 1 week ago

Note: We're absolutely happy to answer questions about pyRDDLGym, but I would encourage general questions to go to the Discussions section.

pyRDDLGym is designed to support planners and compilations to different target languages used by a variety of planning methodologies. You can find a number of out-of-the-box planners (including the PROST MCTS planner for discrete RDDL domains) as sub-repositories of the top-level of the project.

It's important to note that pyRDDLGym can generally model stochastic discrete & continuous state and action MDPs (and POMDPs). Deterministic, discrete search methodologies such as A* would not readily apply in expressive stochastic, continuous MDPs (and this is why PROST MCTS only supports a discrete subset of RDDL). That said, we do provide planners for these more expressive continuous cases such as JaxPlan (a gradient-based planner) and GurobiPlan (a mixed integer optimization-based planner). All sub-repositories have linked readthedocs documentation (linked under "About") and for further reading, here is a recent paper discussing JaxPlan and GurobiPlan.
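To illustrate the point about stochastic, continuous MDPs, here is a minimal Python sketch of forward simulation with a sampled (generative) model. It does not use the pyRDDLGym API: `toy_step`, `rollout_return`, `estimate_value`, and the two policies are hypothetical names for a toy 1-D problem, chosen only to show that the same state and action can lead to different successors, so a plan is scored by averaging sampled rollouts rather than by expanding a single deterministic successor as A* would.

```python
import random


def toy_step(state, action, rng):
    """Hypothetical stochastic, continuous transition (not part of pyRDDLGym)."""
    next_state = state + action + rng.gauss(0.0, 0.1)  # Gaussian process noise
    reward = -abs(next_state)  # reward is highest when the state is near 0
    return next_state, reward


def rollout_return(state, policy, horizon, rng):
    """Simulate one trajectory forward and return its undiscounted return."""
    total = 0.0
    for _ in range(horizon):
        state, reward = toy_step(state, policy(state), rng)
        total += reward
    return total


def estimate_value(state, policy, horizon=5, n_rollouts=200, seed=0):
    """Monte Carlo estimate of a policy's expected return from `state`."""
    rng = random.Random(seed)
    returns = [rollout_return(state, policy, horizon, rng)
               for _ in range(n_rollouts)]
    return sum(returns) / len(returns)


def go_to_zero(state):
    """Toy policy: cancel the current state in one step."""
    return -state


def stay_put(state):
    """Toy policy: never act."""
    return 0.0
```

Under these assumptions, `estimate_value(1.0, go_to_zero)` should come out higher than `estimate_value(1.0, stay_put)`, and two calls to `toy_step` with the same state and action generally return different successors.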

MFaisalZaki commented 6 days ago

Thanks, @ssanner, for highlighting this.

mike-gimelfarb commented 4 days ago

Thanks @ssanner for the excellent discussion of the limitations!

Actually, there are progressive widening (PW) and other continuous-space extensions of MCTS that we've had some interest in exploring further, but we don't have the bandwidth currently for pushing in that (very nontrivial) direction. Any work that would even attempt to do this while leveraging the idioms of RDDL in any meaningful way would certainly qualify as novel research worthy of publication.

If you want to implement your own planner on top of pyRDDLGym (or build a wrapper that lets pyRDDLGym interface with existing PW-MCTS methods) and need a head start, you may want to look at this very simple prototype we wrote for v1.4.4, which used both gradients and a very basic PW-MCTS. It was subsequently removed because scaling and tuning such algorithms for complex problems is very nontrivial.
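For readers unfamiliar with progressive widening, the following is a minimal Python sketch of the idea at a single tree node. It is not the removed v1.4.4 prototype and does not call pyRDDLGym: the `Node` class, the widening rule `len(children) <= c * visits ** alpha`, the constants, and the noisy toy objective in `search_root` are all assumptions made for illustration. A new continuous action is sampled only while the number of children is small relative to the visit count; otherwise an existing child is chosen by UCB1.

```python
import math
import random


class Node:
    """Search node holding per-action statistics (hypothetical structure)."""

    def __init__(self):
        self.visits = 0
        self.children = {}  # action -> [visit_count, total_return]


def should_widen(node, c=1.0, alpha=0.5):
    """Progressive widening test: allow a new child while |children| <= c * N^alpha."""
    return len(node.children) <= c * node.visits ** alpha


def select_action(node, sample_action, rng, c_ucb=1.4):
    """Either add a freshly sampled action or pick an existing one by UCB1."""
    if should_widen(node):
        action = sample_action(rng)
        node.children.setdefault(action, [0, 0.0])
        return action

    def ucb(item):
        _, (count, total) = item
        return total / count + c_ucb * math.sqrt(math.log(node.visits) / count)

    return max(node.children.items(), key=ucb)[0]


def update(node, action, ret):
    """Back up one sampled return into the node statistics."""
    node.visits += 1
    stats = node.children[action]
    stats[0] += 1
    stats[1] += ret


def search_root(n_iters=200, seed=0):
    """Run the widening rule at the root of a toy one-step problem."""
    rng = random.Random(seed)
    root = Node()
    for _ in range(n_iters):
        action = select_action(root, lambda r: r.uniform(-1.0, 1.0), rng)
        # Assumed toy objective: noisy return that peaks at action = 0.3.
        ret = -(action - 0.3) ** 2 + rng.gauss(0.0, 0.05)
        update(root, action, ret)
    return root
```

With these settings the number of distinct actions at the root grows roughly like the square root of the visit count, so 200 iterations keep the branching factor far below 200. A full planner would apply the same rule recursively at every state node and would need a simulator that can be reset to arbitrary states, which is where the scaling and tuning difficulty mentioned above comes in.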

MFaisalZaki commented 4 days ago

Thanks @mike-gimelfarb. This sounds interesting. I will have a look and see what I can do with it. But can it solve the basic planning problems?