plaans / aries

Toolbox for automated planning and combinatorial solving.
MIT License

Some questions and general thoughts on AI planning & acting and Aries #87

Closed nrealus closed 1 year ago

nrealus commented 1 year ago

Hello, Aries team!

I've known about this project for some time, since I've been following the work of Arthur Bit-Monnot with close attention for a while now. I got interested in AI planning and acting about 3 years ago, but I am not a professional (researcher or student) in the field. Perhaps you could say I am an "amateur" researcher. I am also very interested in the work on OMPAS; I hope there'll be papers about it at ICAPS 2023!

Despite knowing about Aries for some time, I only recently decided to take a closer look at its architecture. I had been holding back on that because I am not proficient in Rust and only know SMT/SAT/CP at a basic level. Another reason was some skepticism about how replanning and/or plan repair could efficiently be done in an integrated/online actor using a "non-constructive" CP approach (like that presented in the paper on LCP [1]), as opposed to a "constructive" approach (like in FAPE [2], with explicit handling of flaws and resolvers). This was important to me, since my main interests in AI planning & acting are the (online) acting part, as well as the combined aspects of time, uncertainty, and goal reasoning. However, I recently changed my view on the integration of an actor with a "non-constructive" CP-based planner, which I now think is quite promising (but more on that later). Finally, your recent refactoring effort encouraged me to take a closer look at some of Aries' inner workings.

I am writing because I had some questions, thoughts, and ideas on AI planning & acting in general (mostly about integrated/online acting, time, uncertainty, and goal reasoning) and, of course, Aries and its research directions. I would be glad to hear your opinion on them. These questions and ideas actually pick up on those that we discussed with Arthur by email a few months ago. The insights were very valuable and helped me understand a lot of new things in the field.

First of all, I have one very simple question to start with: what exactly is the "validator" folder/module that was recently added to the codebase? Now, onto the big wall of text.

About the integration of planning, (online) acting, and goal reasoning

Until a month ago, I was actively thinking about ways to "seamlessly" incorporate planning and (online) acting with goal reasoning, based on the Goal Lifecycle line of work of Roberts et al. [3],[4],[5],[6]. I was looking into ways that the FAPE actor or a RAE-like actor could be modified to use "Goal Strategies" (Formulate, Select, Expand, Commit, Dispatch, Continue, Repair, Replan, Defer, Reformulate) to try to unify planning, (online) acting, and goal reasoning. For instance, I was thinking about a principled way to encode "Goal Strategies" as separate "meta-level" skill handlers communicating with each other, with a new type of message, analogous to "planupdate", for each "Goal Strategy". However, I eventually gave up, considering this "unifying" idea way too complex (and unnecessary, despite its attractiveness), mostly because of difficulties in dealing with multiple concurrent "selected goals".

I decided to further investigate another line of work on goal reasoning, which is "more classical" than Goal Lifecycles, as it builds on GDA (Goal-Driven Autonomy), but still shares many similarities with them. These are the "Goal Operations" of Kondrakunta et al. [7],[8],[9]. The FAPE actor could be extended with GDA in quite a simple manner, by calling some sort of "GDAProcedure". This "GDAProcedure" should be placed after observation merging, in place of vanilla FAPE's "new goal" integrations, which take place before "PlanUpdates" integrations. Indeed, in accordance with GDA, the GR decision integrations should be placed after observation merging, as goal reasoning needs to know the discrepancies that "MergeObservations" (and "PlanUpdates") could introduce (see Figure 2 in [3]). This "GDAProcedure" could take as input the current plan chronicle (with already merged observations), as well as, possibly, the plan chronicle before "PlanUpdates" and observation merging, or even a trail of all the modifications to the plan chronicle.
The output of the "GDAProcedure" could be a list of Kondrakunta et al.'s "Goal Operations". In the chronicle framework, the "Goal Operations" could be functions with preconditions (on the state of the actor, the expectations, or the goal agenda) as input and "phi_del" and "phi_add" "partial" chronicles as output, analogously to the "PlanUpdates" of FAPE. The Task Modifiers of [10] could also be seen as a possible realisation of the idea of Goal Operations and the "GDAProcedure". The determination and choice of these "Goal Operations" could be done in a "rule-based" way (like in [9]), with reinforcement learning, or even using classical planning in a space of "Goal Operations", similarly to what [11] aims at. However, there should be a way to "connect" the GR decisions with replanning and plan repair. The most straightforward idea is to simply retry the "GDAProcedure" if the plan repair or replanning that follows is unsuccessful. Another idea is to branch over Goal Operations, as if it were a planning space. Yet another idea would be to mark elements (typically assertions) added to (and maybe even removed from) the plan chronicle as a result of a GR decision with information on the origin of their addition or removal. This would be similar to how "MergeObservations" can mark assertions as "conflicting" with an observation, and then (during plan repair) introduce that observation as an "a priori" supported assertion. Also, this would allow connecting chronicle elements (assertions) to an "external" goal agenda, outside of the planning chronicle.
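To make the shape of such a "Goal Operation" concrete, here is a minimal sketch in Rust. All the type and field names (`Chronicle`, `Assertion`, `GoalOperation`, `phi_del`, `phi_add`) are illustrative assumptions of mine, not part of Aries or FAPE; the point is only that an operation pairs a precondition on the current (observation-merged) plan chronicle with a pair of chronicle deltas, analogously to "PlanUpdates":

```rust
// Hypothetical sketch of a "Goal Operation" (Kondrakunta et al.) as a
// precondition over the plan chronicle plus two chronicle deltas.
// None of these types come from Aries or FAPE; they are illustrative only.

#[derive(Debug, Clone, PartialEq)]
struct Assertion {
    variable: String, // state variable the assertion constrains
    value: String,    // asserted value (temporal aspects omitted here)
}

#[derive(Debug, Default)]
struct Chronicle {
    assertions: Vec<Assertion>,
}

struct GoalOperation {
    name: &'static str,
    // Precondition on the actor's state / the chronicle with merged observations.
    precondition: fn(&Chronicle) -> bool,
    phi_del: Vec<Assertion>, // elements to retract from the plan chronicle
    phi_add: Vec<Assertion>, // elements to add to the plan chronicle
}

impl GoalOperation {
    /// Apply the operation if its precondition holds; returns whether it fired.
    fn apply(&self, chronicle: &mut Chronicle) -> bool {
        if !(self.precondition)(chronicle) {
            return false;
        }
        chronicle.assertions.retain(|a| !self.phi_del.contains(a));
        chronicle.assertions.extend(self.phi_add.iter().cloned());
        true
    }
}
```

A "GDAProcedure" would then just select a list of such operations (rule-based, learned, or planned over) and apply them to the chronicle before handing it to repair/replanning.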

I should also point out that the reason I am considering the FAPE actor as a base in all of this, is its simplicity and very general capabilities, thanks to Skill Handlers.

Now how does Aries tie into this? Well, as I said earlier, I was a bit skeptical about using a "non-constructive" CP-based planner, as I thought it would not allow as much flexibility for "on the fly" modifications to the plan chronicle (plan updates, observation merging, plan repair, replanning, and perhaps goal-driven autonomy). But I now think that, on the contrary, a "non-constructive" CP-based planner like Aries should actually be even better for this, since using it could simply boil down to modifying the "initial chronicle" fed to the planner by the actor, for example by setting the presence variables of some elements of the plan chronicle to false, to replicate the plan repair of vanilla FAPE (paragraph 5.5.4.1 in [2]). However, as I do not have a very strong background on SAT/SMT/CP solvers' inner workings, strengths, and weaknesses, I don't really know what to think. What is your opinion: would it still be possible to have the same (or better?) level of flexibility and integration between the actor and planner with a planner like Aries, compared to the vanilla planner of FAPE?
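To illustrate what I mean by "repair boils down to modifying the initial chronicle", here is a small sketch. The `Presence` and `ChronicleElement` types are my own illustrative stand-ins, not the Aries API: elements invalidated by an observation get their presence forced to false, while everything else is left free so the solver may reuse it when it re-solves:

```rust
// Hypothetical sketch of plan repair via presence variables: the actor
// re-submits the plan chronicle to the solver, forcing broken elements
// absent and leaving the rest free. Illustrative types only, not Aries.

#[derive(Debug, Clone, Copy, PartialEq)]
enum Presence {
    Forced(bool), // decided by the actor (false = element must be absent)
    Free,         // left for the solver to decide
}

#[derive(Debug, Clone)]
struct ChronicleElement {
    id: usize,
    presence: Presence,
}

/// Build the "initial chronicle" for a repair attempt: elements whose ids
/// appear in `broken` are forced absent; all others remain free, so the
/// solver can keep as much of the old plan as is still consistent.
fn prepare_repair(plan: &[ChronicleElement], broken: &[usize]) -> Vec<ChronicleElement> {
    plan.iter()
        .map(|e| {
            let presence = if broken.contains(&e.id) {
                Presence::Forced(false)
            } else {
                Presence::Free
            };
            ChronicleElement { id: e.id, presence }
        })
        .collect()
}
```

Under this reading, repair, replanning, and even GR decisions are all just different policies for setting presence flags before re-solving, which is why the "non-constructive" approach now looks attractive to me.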

About quality metrics, uncertainty, and online acting (timepoint dispatching)

Another particular interest of mine is uncertainty (temporal, and maybe even more general) and online acting. My personal and non-expert feeling is that in most research, the focus on satisfying dynamic controllability, while it does have a purpose, restricts the research on (approximately) optimal timepoint dispatching. Even in vanilla FAPE, (action start) timepoints are instantiated as early as possible, although online "optimal" dispatching remains possible thanks to the delegation of interior timepoint instantiation to skill handlers. Because it focuses on the latter, the line of work of St-Guillain [12],[13] is my main basis here. Until recently I was thinking about ways to integrate (in a non-naive way) his MCTS approach into the constructive search process of the vanilla FAPE planner. However, that doesn't seem very promising because of computational cost. Moreover, as Arthur points out in his work, there is quite a lot to be leveraged from CP/OR scheduling work (the works of Laborie, CP Optimizer...). This consideration and my appreciation for St-Guillain's approach led me to read about Stochastic Constraint Programming (SCP) [14],[15],[16], as well as (continuous) Stochastic Satisfiability Modulo Theories ((c)SSMT) [17],[18]. These fields seem to still be in their infancy, despite having existed for a long time (SCP). Do you think it could be possible to wrap the Aries solver into another solver, which would be able to deal with randomness/stochasticity? Or could it even be possible to extend Aries (with a custom reasoner, perhaps) to support that? I doubt that last idea, though, as I have a feeling that supporting randomness/stochasticity as in SCP or cSSMT would require modifying the core of Aries as well (which surely wouldn't be a good idea).
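To clarify what I mean by "wrapping" the solver rather than modifying its core, here is a toy sketch of the simplest such wrapper: sample scenarios of the random quantities (e.g. durations), call a deterministic solver on each scenario, and pick the candidate decision with the best average outcome (sample average approximation). `solve_makespan` is a stand-in of mine for a call into a CP solver like Aries; none of this reflects the actual Aries API:

```rust
// Hypothetical "stochastic wrapper" around a deterministic solver:
// evaluate each candidate decision on sampled scenarios and keep the one
// with the best average objective. Illustrative only.

// Tiny deterministic linear congruential generator so the sketch needs no
// external crates; yields pseudo-uniform values in [0, 1).
fn lcg(seed: &mut u64) -> f64 {
    *seed = seed
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    (*seed >> 11) as f64 / (1u64 << 53) as f64
}

/// Stand-in for a deterministic solver call: the makespan obtained for a
/// fixed dispatch decision (a start time) under one sampled scenario.
fn solve_makespan(start: f64, sampled_durations: &[f64]) -> f64 {
    start + sampled_durations.iter().sum::<f64>()
}

/// Sample-average approximation: return the candidate start time with the
/// best average makespan over `n` sampled scenarios.
fn best_start(candidates: &[f64], n: usize) -> f64 {
    let mut seed = 42u64;
    let mut best = (f64::INFINITY, candidates[0]);
    for &start in candidates {
        let mut total = 0.0;
        for _ in 0..n {
            // Two task durations, each uniform on [1, 2): a stand-in for
            // whatever distributions the scenarios are drawn from.
            let durations = [1.0 + lcg(&mut seed), 1.0 + lcg(&mut seed)];
            total += solve_makespan(start, &durations);
        }
        let avg = total / n as f64;
        if avg < best.0 {
            best = (avg, start);
        }
    }
    best.1
}
```

This is of course far cruder than SCP or cSSMT (which reason about the distributions inside the solver), but it shows why a wrapper needs no changes to the wrapped solver's core, only many deterministic calls to it.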

Sorry about the large walls of text, some of the convoluted and confusing wording, and certainly some unnecessary details. And thank you very much for your time and attention!

[1] https://homepages.laas.fr/abitmonnot/files/18-cp.pdf
[2] https://oatao.univ-toulouse.fr/17704/1/Arthur%20Bit-Monnot.pdf
[3] https://ojs.aaai.org/aimagazine/index.php/aimagazine/article/view/2800
[4] https://journals.flvc.org/FLAIRS/article/download/128553/130031
[5] https://apps.dtic.mil/sti/pdfs/ADA610455.pdf
[6] Goal reasoning, planning, and acting with actorsim, the actor simulator
[7] https://corescholar.libraries.wright.edu/cgi/viewcontent.cgi?article=3685&context=etd_all
[8] https://www.dustindannenhauer.com/papers/aaai17.pdf
[9] https://sravya-kondrakunta.github.io/9thGoal-Reasoning-Workshop/papers/Paper_5.pdf
[10] https://arxiv.org/pdf/2202.04611.pdf
[11] https://www.rogel.io/content/2-publications/cardona-rivera2022re-examining.pdf
[12] https://ai.jpl.nasa.gov/public/documents/papers/MSG_Lila_IWPSS2021_paper_15.pdf
[13] https://uploads-ssl.webflow.com/60d986c079713a00706079a0/60fa99924ed0880a131e639e_3-Probabilistic_Temporal_Networks__with_Ordinary_Distributions___JAIR.pdf
[14] https://arxiv.org/pdf/0903.1152.pdf
[15] https://arxiv.org/pdf/0903.1150.pdf
[16] https://arxiv.org/pdf/1704.07183.pdf
[17] Stochastic Satisfiability Modulo Theory: A Novel Technique for the Analysis of Probabilistic Hybrid Systems
[18] CSiSAT: A satisfiability solver for SMT formulas with continuous probability distributions

EDIT:

After I wrote this post, I came across a recent PhD thesis by AJ Wang (Risk-Bounded Dynamic Scheduling of Temporal Plans). The approach described in it is probably much better suited to Aries than what I was thinking of in the "About quality metrics, uncertainty, and online acting (timepoint dispatching)" section I initially wrote.

nrealus commented 1 year ago

Discussed by email.