pstlab / oRatio

oRatio is an Integrated Logic and Constraint based solver
Apache License 2.0
6 stars 1 forks source link

Various questions #21

Closed nrealus closed 3 years ago

nrealus commented 3 years ago

Hello,

I am a master-level student who discovered and got very interested in the field of AI planning during this summer. I have read your thesis on oRatio and timeline-based planning, as well as a fair amount of other papers on AI planning and temporal networks published in the past 20 years. However, the subject isn't really related to my studies, and I am not really interested to pursue academic work in it - for now at least ! As such, I am no specialist at all, but as an engineer, I do find the subject to be extremely interesting, important and promising for real-world applications.

First of, I'd like to say that your work on oRatio appealed to me the most out of all recent planners I've encountered in academic literature. I think it is great in its design and architecture : it is rich, modular, extensible, elegant but also robust, compact and non-restrictive for the user. Finally, I feel like it can be truly used to approach a very wide range of problems, possibly by extending and "specialising" it if necessary (with more advanced heuristics, types, smt theories...).

The reason why I am writing is because I would like to have some clarifications and input on a few issues and questions that came to my mind. Indeed despite all the reading, I'm not a specialist, and there are things I am not really sure I'm figuring out or getting correctly.

There it is, sorry for the very long post. Thank you in advance, and best of luck with this very promising project !

riccardodebenedictis commented 3 years ago

Thank you very much for the questions, which give me the opportunity to clarify some aspects!


Version 2.9 aims to clean up some aspects of the code, fix bugs and make the graphical interface "more modern". In particular, the old version used prefuse as a library for viewing graphs, a library which has not been maintained for decades and which creates problems with the most recent versions of java. This version also aims to have more organized benchmark problems so that it is easier to demonstrate the qualities/properties of the solver.

All of these changes aim at developing a more effective heuristic. The current heuristic, in particular, is inspired by the classical planning heuristic h_max, which is a heuristic from about twenty years ago. Adapting that heuristic to this form of reasoning was not very trivial, but now that it has been done it is possible to adapt other classical planning heuristics so as to make the solution process much more efficient (hence the need to have better organized benchmarks and a graphical interface that allows to understand how heuristics behaves).


Regarding uncertainty, I did not investigate it deeply. It is definitely possible to introduce an additional variable to each atom for representing its observability, etc. and it would also be possible to implement a STNU as a SMT theory. This STNU, however, should be disjunctive (to manage possible alternative plans). I don't know any algorithm that handle this problem and creating a new one could be an interesting research problem. What I would do, however, is to extend the linear arithmetic theory so as to handle optimization. Most of that code basically comes from the simplex method, so it should be easy to extend it to manage optimization. Specifically, since constraints are linear, I would assume that minimizing/maximizing a temporal variable in a (partial) solution would give you the variable's bounds and, if they are strictly contained in the original ones (i.e., those in the problem) then the events associated to the variable are no more free to happen "whenever they want" (so, the uncertainty is not guaranteed and the solver has to find for another plan). But I have to think more deeply on that. Introducing optimization might be useful also for other reasons.

More in general, however, the causal graph contains a representation of different plans that represent solutions for the given planning problem. In case of an unexpected event (any kind of event, not necessarily a delay!), it would be possible to switch from one plan to another.


It is certainly possible to update the domain at runtime adding new fields, new methods, new predicates etc.. Currently, in order to do that, you have to inherit from ratio::type (I will add some simple documentation for that, but the methods' names should be kinda self explanatory). You have some example of that in the reusable resource. The question is: why would you do that? Types such as state-variables and reusable-resources implement the get_current_incs() procedure which allows to make the corresponding timelines consistent. Types that do not implement this method can probably be defined much more easily in riddle. Most of the code, additionally, is the one defined within the rules (when, in riddle, predicates are defined) and, writing it in cpp, could be a hard work. Yet it is probably me who does not see the usefulness! So, if you need it, I would ask you to please explain better what you have in mind :wink:.

The biggest problem, however, comes in case of unifications. If you dynamically introduce new fields, specifically, you might have different instances of the same type having different fields, and I would not know how to force them to be equal..

From instances of ratio::type, however, you can already retrieve fields, methods, predicates, etc., all of them in bulk, or singularly by their names. It is also possible to retrieve instances of objects which are created, for example, in riddle. You can, for example, create ratio::solver instance. Through the read you can read some riddle script (e.g., a script for creating a real variable: real x0;) and then retrieve it from the solver's get method (e.g., get("x0)";). The returned arithmetic expression can be evaluated (e.g., solver.arith_value(expr)) or constrained with other arithmetic expressions.

    solver slv;
    slv.read("real a, b;");

    auto a = slv.get("a");
    auto b = slv.get("b");
    auto c = slv.geq(slv.add({a, b}), slv.new_real(1));
    slv.assert_facts({c->l});
    slv.solve();

    auto a_val = slv.arith_value(a);
    auto b_val = slv.arith_value(b);

Notice that a and b are real variables, while c is a bool variable. When the c fact is asserted (i.e., the c variable is forced to assume the true value), the constraint a + b >= 1 is enforced. If you retrieve the variables' values you will get that their sum is greater or equal than 1 (in my case, a_val is 1/1 and b_val is 0/1). Perhaps the API can be cleaned up (e.g. by overriding the arithmetic operators on the arithmetic expressions), but I hope the above code gives an idea of what is possible to do.


As regards execution, as you correctly guessed, the hard part is related with plan repair. All the references should be available for adapting a plan from a solution. If the repair regards some delay, it should be possible to introduce the delay as a constraint and call again the solve() method. The causal graph, additionally, maintains causal relations among atoms! So, in principle, it should be possible to introduce dynamically new atoms or to remove (actually, hide them) the existing ones. By calling the solve() method, the causal graph should adapt, exploiting information that previously led to the generation of the solution (hence, the adaptation should be efficient). Unfortunately, nothing of that has ever been tested.. so if you have some practical (possibly simple :D) example, I am more than happy to test it and make it work!


I hope that I answered most of your questions! If not, feel free to ask again! There is, for sure, plenty of room for improvement.. Discussions like this are also very valuable to me! :grinning: I am the only one working on this, so changes might require some time, yet if they reflect some real need or some specific research issues, I am more than happy to introduce them :wink:

nrealus commented 3 years ago

Thank you very much for your answer, it did help me understand things better !


About uncertainty : I had the impression there was recently quite a lot of activity on the subject of DTNUs (Disjunctive TNUs) and even PODTNUs (partially observable) and MADTNU (multi-agent). Maybe you've already come across them, but there were very interesting recent papers by Nikhil Barghava like this one and this one. Maybe you could find elements that would interest you there, but then again, my understanding may be a bit off.

However, I am not sure I understood correctly what kind of optimization you were talking about as a possible extension for the LRA-theory, and how exactly it would help dealing with uncertainty...


About runtime domain edition : Thank you for the clarifications and example. It is really great that the c++ code can work so easily with type instances of the planning domain so easily. And my conclusion is that it would be best to communicate with the solver using RIDDLe "snippets", correct ?

To clarify what I had in mind when asking this question : I was mainly thinking of an "online" situation, where the plan would be executed in real-time (basically, during "acting"). And I was thinking about the possibility of modifying and/or extending the domain during execution, for example, based on exogenous events or information received from "outside" (another program, module) to adapt to the "environment".

Another way to interpret what I had in mind was close to "generative planning", where we could add/modify "decomposition methods" (in HTN terms) for an action. If I understand correctly, in oRatio that could correspond to adding or changing predicates for a state variable or a propositional agent, which could be achieved in the way you described : by making the solver read new RIDDLe snippets. In my mind, that could be done either manually, or, for example, by a "higher-level" machine learning based module, which would analyze the environment and react to it by formulating suitable goals etc in RIDDLe. One of the papers that made me think of such a situation was this one. However, what they are addressing is more of a "plan first then learn" situation, and what I was describing is closer to "concurrent planning (actually acting) and learning".

I hope this was a bit clearer :grinning: !


As to the last point : I don't have a practical / simple example to give you (yet ! :D), but I now realise that what I just described is actually closely related to plan-repair... Also, what exactly do you mean by "delay" (as a reason to repair) ? Do you mean the delay in time of a token (representing an event) ? If so, this looks close to the uncertainty problem we discussed earlier... Perhaps uncertainty and "reactive-acting" or plan-repair could be approached in a generalised homogenous manner ? If I recall correctly, this is something that FAPE more or less suggested, but I may be off on that.


I am glad that my questions and interest can help you in your research !

If I happen to get a better grasp of oRatio, even though there are things I understand only in a very basic way (SMT, specifically), I would be glad to try contributing with a few code or design suggestions on the issues discussed here. However, I am unsure I'll be able to put the effort and time delving in this effectively advanced research field as it takes a lot of time and I have main university studies in parallel...

riccardodebenedictis commented 3 years ago

Thank you for the references!

I will try to go deeper and better understand the problems faced. Apparently, indeed, DTNUs might be fine! But solving these problems is EXPTIME :hot_face:. In such cases I would start to consider rather moving to reinforcement learning (if don't want/need guarantees) or to formal methods (if you need guarantees).

When talking about optimization I am referring to linear programming. The idea (perhaps a naive one, I should study more this topic... :man_facepalming:) would be to "simply" find a planning solution and then check its controllability by forcing, through optimization, its failure (and, in case, backtrack to another solution). Consider that temporal reasoning aspects are currently managed thgough general linear constraints and not through temporal networks (even though I have recently added an option for choosing between the two), hence managing uncertainty for the more general linear case could be an added value.

Although, however, I am pretty sure that most of the above uncertainty/controllability techniques could be implemented/integrated in oRatio (and I would be more than happy if someone would do it! :grin:), I'm currently moving on another direction: in case of exogenous events, try to adapt the plan and, if it fails, be quick (thanks to an effective heuristic) to replan. This would also answer your question about dynamically changing the domain. Why would you need it if you can efficiently replan from scratch? You would just create a new planning problem, with the updated domain, and let the solver provide a solution for it.

nrealus commented 3 years ago

Right, I understand what you mean !

You're right, a more "straightforward" approach to deal with uncertainty and acting (allowing, in particular, to support behaviour close to "plan repair") could indeed work very well. I may have unconsciously assumed that such an approach would be unfeasible due to extreme inefficiency. But I have probably underestimated the potential performance of such "straightforward" approach.

Basically, this is a "closed-loop feedback" approach to acting, inspired by control theory, which I have found mentioned in litterature too (I don't have access to the references right now). We would update the domain and/or solve again from scratch, after detecting an "error" (again, control theory analogy) i.e. "inconsistencies in general". And if I'm guessing correctly, the "effective heuristic" you're talking about would reduce the search space of the solving procedure based on what the "inconsistency" could affect, possibly by analysing the causal graph, right ? Or if it doesn't reduce the search space, it could be use to sort the choices made during the search. Maybe the linear-programming style optimization idea to extend the LRA-theory could also be used to allow the solver to better deal with this ?

I do like this "closed-loop" approach, it is relatively straightforward and independent / decoupled from the planning part, without being too brutal, inefficient and without introducing needless complexity in the general design (or so it seems).

However, I'm wondering what exactly would these inconsistencies be, and how could we detect them - will it be enough to analyse the causal graph or check the temporal consistency of the timelines ? I'm not completely at ease with all of the (timeline planning) formalism used in oRatio yet, so I'm probably missing something here. Maybe updating the domain and resolving from scratch should be done after absolutely any event, without checking whether it introduces an inconsistency ? But my feeling is that this would be way too inefficient and unexploitable in non-basic cases... What do you think about this ?

This "closed-loop" approach to acting doesn't seem to be that complicated and massive (at least from afar) and doesn't involve changing much stuff in the current code, as it could be a separate module. I'll see if it's something I could be able to play around with. It would also be an "hands-on" occasion to get more familiar with your code. Of course, I'll try communicating with you if I manage to produce something even remotely interesting !

Also, there could be other questions or wanted clarifications that may cross my mind in the future... But I promise it won't become annoying, or at least I'll try so it doesn't ! πŸ˜„

riccardodebenedictis commented 3 years ago

Interactions help at making things better, so don't worry, you're not annoying :wink: (but please, let's focus on more specific topics, otherwise I have problems at answering you :sweat_smile:)

The aim of the current heuristic, based on the causal graph, is both to prune the search space, by early detecting infeasible regions, and to suggest to the solver the most promising choices. As I said, the current heuristic is not the best, and it does not allow yet the solver to scale on more complex instances, but it is a starting point for the definition of more promising heuristics!

Since the current heuristic aims at solving more complex (from a complexity theory point of view) problems than temporal ones (oRatio planning is, in general, undecidable, while DTNU is EXPTIME), I would say that the temporal aspects in the solution process have a marginal role (from a complexity theory point of view, this role is, asymptotically, null). In particular, managing uncertainty using standard algorithms (one of the papers you cited refers to this for the resolution algorithm) would have the drawback of losing the information that comes from the definition of the planning problem. In other words, by reasoning from a DTNU point of view you would lose the information related to the causality among the tokens/atoms, resulting, probably, in a less efficient resolution process (even if it is difficult to say a priori, without having empirical results in hand). For this reason, the management of uncertainty must be strictly integrated with the heuristics based on the causal graph.

Perhaps most of the work could be done by the solver and by the causal graph, reducing the management of the uncertainty to a STNU, which is much more manageable and also easier to implement πŸ˜„

nrealus commented 3 years ago

I think I see what you mean, but I'm a bit confused... would that be included in the optimization approach you mentioned, or were you speculating about it as a separate idea ?

And if I'm guessing correctly, what you just described would imply that atoms and tokens should be labeled as controllable or contingent, right ? And the STNU would then be included as a theory for the sat core, in order for it to be able to reason on atoms.

Sorry, if I'm making you repeat stuff πŸ˜†

riccardodebenedictis commented 3 years ago

You're right, I messed things up.

oRatio 2.9 can manage temporal information in two alternative ways: Linear Arithmetic (LA uses the lra_theory, which can manage disjunctions of linear arithmetics) or Difference Logic (DL uses the dl_theory, which is, basically, a disjunctive temporal network).

The heuristic, however, works at a higher level, ultimately resulting in the choices of the disjuncts, both in the LA and in the DL case, according to the structure of the specific planning problem (that emerges from the application of the rules and results in the construction of the causal graph).

So, in order for oRatio to manage uncertainty, there are two possible ways (at least, these are the ones I see):

  1. Extend the lra_theory to manage optimization in a "linear programming" fashion. The lra theory is based on the simplex method, so extending it should be a rather easy task. By minimizing/maximizing (combinations of) variables it should be "easy" to check the controllability of a (partial) solution. Uncertainty would be naturally extended also to dimensions which are not strictly temporal (e.g., the amount of consumption of a resource). Finally, optimization could be exploited for other tasks such as, trivially, linear programming.
  2. Extend the dl_theory to manage uncertainty. This theory, basically, currently maintains a STN. In addition, however, information about inactive temporal constraints is propagated to the causal graph (and, hence, to the heuristic is affected) and, in case of conflicts, an explanation is generated (i.e., a negative cycle). This STN should be "just" extended as a STNU.

I think that it would be interesting to implement both, in order to make a comparison. In any case, in order to manage uncertainty, atoms/tokens (and/or constraints) should somehow be labeled as controllable/contingent.

(btw, you can choose the temporal network by setting the TEMPORAL_NETWORK_TYPE parameter, at compile time, to either LA or DL)

I hope I have clarified the point. If not, don't hesitate to ask πŸ˜‰

nrealus commented 3 years ago

Aah, I see ! Yes this is much more clear ! Thank you very much for all the explanations and for your time !

I will try to play around with this and maybe attempt to implement something for this matter.

In any case, I am very much excited and looking forward to any advancements in this project !

Until next time, good luck ! πŸ˜„