oracle-samples / clara-rules

Forward-chaining rules in Clojure(Script)
http://www.clara-rules.org
Apache License 2.0
1.2k stars 115 forks source link

Revisiting condition binding analysis #373

Open WilliamParker opened 6 years ago

WilliamParker commented 6 years ago

Problem statement

Currently, analyze-condition takes conditions, including arbitrarily nested ones, and returns information on bindings that the condition creates and ones that it needs to consume from a preceding condition. This information is used to sort conditions into the order in which they will appear in the rules network. I believe this functionality currently has a number of shortcomings and that we need to rethink it in order to resolve them.

(defquery myquery "" []
                    [:or
                     [:Type1 (= ?x (:field this))]
                     [:Type2 (= ?y (:field this))]])

clara.test-rules> (-> myquery :lhs first com/analyze-condition)
{:bound #{?x ?y},
 :unbound #{},
 :condition
 [:or
  {:type :Type1, :constraints [(= ?x (:field this))]}
  {:type :Type2, :constraints [(= ?y (:field this))]}],
 :is-accumulator false}

This poses problems if it causes this condition to be sorted before conditions that need to consume ?x or ?y since it will only create one of ?x or ?y, not both.

Or take a more complicated case, where each branch of an :or needs to consume a binding, but a different binding in each branch:

 (defquery myquery "" []
                    [:or
                     [:Type1 (= ?x (:field this))
                      (some-predicate? ?y this)]
                     [:Type2 (= ?y (:field this))
                      (some-predicate? ?x this)]])
clara.test-rules> (-> myquery :lhs first com/analyze-condition)
                      {:bound #{?x ?y},
                       :unbound #{},
                       :condition
                       [:or
                        {:type :Type1,
                         :constraints [(= ?x (:field this)) (some-predicate? ?y this)]}
                        {:type :Type2,
                         :constraints [(= ?y (:field this)) (some-predicate? ?x this)]}],
                       :is-accumulator false}

I don't see any way to do a topological sort on something where there are multiple possibilities for what sets of bindings must be consumed from predecessors and which can be created.

(-> (dsl/parse-query []
                        [[:not [:and
                            [:x (= ?z (:z this))]
                            [:y (= ?z (:z this))]]]])
                      :lhs first com/to-dnf)
(:or
    [:not {:type :x, :constraints [(= ?z (:z this))]}]
    [:not {:type :y, :constraints [(= ?z (:z this))]}])

This will then fall into the :or branch of the leaf condition analysis in analyze-condition and the :not conditions will be iterated over. Since they are :not conditions, they will not be considered eligible to bind variables. However, their variables will still be considered unbound. As a result, ?z will be considered an unbound variable of the parent negation.

clara.test-rules> (-> my-neg-query :lhs first com/analyze-condition)
{:bound #{},
 :unbound #{?z},
 :condition
 [:not
  [:and
   {:type :x, :constraints [(= ?z (:z this))]}
   {:type :y, :constraints [(= ?z (:z this))]}]],
 :is-accumulator false}

Since there is an unbound variable that isn't bound by another condition in the query, sorting its conditions will throw an exception.

 clara.test-rules> (-> my-neg-query :lhs com/sort-conditions)
ExceptionInfo Using variable that is not previously bound. This can happen when an expression uses a previously unbound variable, or if a variable is referenced in a nested part of a parent expression, such as (or (= ?my-expression my-field) ...).
Note that variables used in negations are not bound for subsequent
                                    rules since the negation can never match.
Production:

Unbound variables: #{?z}  clojure.core/ex-info (core.clj:4593)

Proposed solution

I propose that:

[A (= ?1 ...)]
[:or [B (= ?1 ...)]
      [C  (= ?2 ....)]]

The "A" condition is a "predecessor" of the :or condition, and the B and C conditions are both descendants of the :or condition. The effect here is that, viewing the top-level conditions as "black boxes" without concern for their contents, their bound variables will be all variables that are certain to be bound if the condition matches and the unbound variables are all variables for which there is any way the condition would require them.

(defrule nil-binding-rule
                    [:or
                     [First (= ?x 1)]
                     [Second (= ?y 2)]]
                    =>
                    (println "X: " ?x " --- " "Y: " ?y))
clara.test-rules> (-> (mk-session [nil-binding-rule]) (insert (->First)) fire-rules)
X:  1  ---  Y:  nil
#object[clara.rules.engine.LocalSession 0x7ac42172 "clara.rules.engine.LocalSession@7ac42172"]
clara.test-rules> (-> (mk-session [nil-binding-rule]) (insert (->Second)) fire-rules)
X:  nil  ---  Y:  2
#object[clara.rules.engine.LocalSession 0x5dc53c9d "clara.rules.engine.LocalSession@5dc53c9d"]

This seems like an accident of implementation that doesn't make much sense to me as a rule-writing pattern though. However, we could support patterns where a binding might not be created from a fact condition. I'm envisioning something like the following:

[?info <- [OtherInformation]]
[?newest <- (newest-fact-accum) :from [SensorReading]]
[:let [?sensor-reading-calculation (.... ?newest ... ?info ....)]]

IMO that kind of notation has a clear and intuitive meaning, especially since core Clojure construct such as for, doseq, etc. use the same notation. I chose an accumulator example since creating bindings for later use from an accumulator result is a pattern that isn't easily supported currently, but you could use such a construct in other ways as well e.g. to explicitly create a nil binding for a binding missing in one branch of an :or condition. My initial instinct wouldn't be to block these changes on this enhancement since the existing patterns it would support are strange and probably not common, but I could be convinced otherwise or change my mind on further thought.

Issues that this approach solves or unblocks

Related issues not solved

I think the goal here should be to establish a logical framework in which our various oddities in the handling of boolean expressions in rules can be resolved rather than simply solving a single case, although implementation may come in stages. The axioms around bound vs unbound variables I described seem consistent to me, but if anyone can poke holes in them I'd rather deal with any inconsistencies I'm not considering now than after implementing a significant change to the compiler like this. Similarly, I'd be interested to hear any thoughts on potential negative impacts on rule-writing patterns of these proposals.

rbrush commented 6 years ago

Some background on the current behavior. We do currently only sort the top level, since that's the only level where users (or generating code) specifies a collection of conditions, as opposed to an explicit boolean expression as done in children. I liked the property of the rule engine being order independent in general, so the process generating a collection of conditions wouldn't need to be concerned with their ordering. That said, I'm not necessarily opposed to weakening this guarantee if it's creating troublesome corner cases. I don't think attempting to sort nested conditions is really worthwhile, since they are an always a structured expression rather than a bag of conditions.

On another topic: the use of nil for un-bindable values was a conscious decision because I felt that it was the Least Surprising Behavior to someone not familiar with the system. Copying the example from above, I think a new user who sees the below post will expect it to compile and the unbound value to be nil, rather than throwing some "Unable to bind value" exception. I think there is precedent for the current behavior in other engines as well.

(defrule nil-binding-rule
   [:or [First (= ?x 1)]
          [Second (= ?y 2)]]
   =>
  (println "X: " ?x " --- " "Y: " ?y))

Of course, my interpretation of Least Surprising Behavior could be debated here.

I do think the central idea of this proposal of reducing the amount of transformation we do with nested expressions has merit, particularly in the binding issues seen in the transformation of [:not [:and ...]] -type structures. So I might lean towards something like this:

I think this covers most of the unbound errors described here.

mrrodriguez commented 6 years ago

@rbrush

I liked the property of the rule engine being order independent in general, so the process generating a collection of conditions wouldn't need to be concerned with their ordering

Maybe I should know this, but I can't think of what reason there is to re-order the conditions as we do now. Do you have a specific reason?

I know that if the rule was written in way that a binding was used prior to being unified to something like:


(defrule rule
  [B (< ?x y)]
  [A (= ?x x)]
  =>
  :foo)

Then the order that the network propagated should have these conditions reversed. However, to me, I always considered the above rule to be written incorrectly.

I've actually never considered rules written in the "wrong order" like this example to be valid. I haven't even tried to write them that way much to test out that theory.


A separate point that relates to this discussion:

We have had it come up a few times in Slack discussions, etc, about Clara potentially being a bit "lazier" in how it evaluates constraints. I think that in particular this is in reference to the alpha network evaluation.

Consider the example:

(defrule r1
  [A (= ?x x)]
  [B (= ?x x) (expensive-thing y)]
  =>
  (insert! (->C)))

In this case, if only facts of type B were inserted (no A facts), the expensive-thing function would be executed anyways. This could lead to a large waste of time if there were a lot of B and perhaps never any A.

You can refactor the rule manually as:

(defrule r1
  [A (= ?x x)]
  [B (= ?x x) (= ?y y)]
  =>
  (insert! (->ABJoin ?x ?y)))

(defrule r2
  [ABJoin (expensive-thing y)]
  =>
  (insert! (->C)))

However, that isn't always obvious. Also, a more complex rule is harder to break up like that.

Note that this lazier evaluation of conditions idea is implemented in Drools 6+ (the Phreak algorithm as they call it). There is more complexity in how it is done there, but the general idea is similar.

To keep some record of conditions that have prior on the appropriate network nodes. These become the "dependency" conditions for a node. With that information, the alpha network (or anywhere else that it applies) could delay execution until all dependencies were satisfied. The caveat to this is that all facts would need to be held that may still match the unevaluated logic. I think this would result in the trade-off of the alpha network not filtering as many facts as it possibly could, for the sake of execution time.

In the presence of condition re-ordering and/or rule conditions being written in no particular order, I think something like "lazier" evaluation becomes more difficult.

mrrodriguez commented 6 years ago

the use of nil for un-bindable values was a conscious decision because I felt that it was the Least Surprising Behavior to someone not familiar with the system.

I've thought that this was a reasonable idea as well. When using an :or in a rule, there is more utility in it if you can use bindings from each branch. I think it is more confusing to get "unbound variable" errors. Binding to nil has a idiomatic Clojure "feel to it".

mrrodriguez commented 6 years ago

Any time we encounter a negation we immediately extract that into its own rule with a NegatedResult, prior to any processing, and do so recursively.

This seems like it would be the simplest solution to the problem with negation condition extraction. It doesn't make sense to do analysis on the conditions prior to extraction since analysis does a to-dnf on these conditions, which is invalid in the first place.

Edit: the rest of this was irrelevant so I updated the comment.

mrrodriguez commented 6 years ago

@WilliamParker

I chose an accumulator example since creating bindings for later use from an accumulator result is a pattern that isn't easily supported currently

I didn't understand this I don't think. Accumulator result bindings have been used in subsequent conditions quite a bit in rules I've seen. Are you referring to cases like #357 ? It seems strange to me that a user wouldn't be able to expect a rule like you have in that example to just work, without this :let construct etc.

mrrodriguez commented 6 years ago

@WilliamParker In your proposal, the first bullet that starts with

We reimplement condition analysis on boolean expressions to ..etc..

I don't understand how it relates to #343 , which is about negation conditions. As discussed in above, if all negations are (recursively) extracted prior to analysis, they will no longer show up during the analysis of conditions phase (not as nested negations that is, they'll be their own group of separate conditions).

rbrush commented 6 years ago

Alright, I'm fine with dropping our top-level topological sort. There is one additional case where it was helpful, but we can probably solve it another way. Consider this rule:

(defrule accum-with-nothing    
    [?all-a <- (acc/all) :from [A (=?x x)]]
    [B (= ?x x))]
    =>
   (do-something-with ?x))

The current sorting behavior pushes the accumulator to be below [B (= ?x x))]. That way ?x will be bound even if the accumulator didn't match anything (in which case ?all-a is an empty sequence). If the accumulator is run first it may or may not bind ?x -- since acc/all will still activate if ?all-a is empty -- so we'd need to handle the possibility of valid-but-unbound ?x from parent nodes in the network.

So we'll need to handle that edge case if we drop the sort, but otherwise I think it'll be okay.

mrrodriguez commented 6 years ago

@rbrush Thanks for the example in https://github.com/cerner/clara-rules/issues/373#issuecomment-357256518

I see how that could be tricky. I'm not sure what the semantics should be there. If the accumulator created a result, ?all-a from no facts, then ?x doesn't have "meaning" for other conditions yet. So should the B condition be able to evaluate or should it prevent the rule from being satisfied?

e.g. If you inserted only B type facts, should the rule fire with ?x bound to whatever B happens to have on each match?

In the case of = that may make sense. However, what about:

(defrule accum-with-nothing    
    [?all-a <- (acc/all) :from [A (=?x x)]]
    [B (< ?x y))]
    =>
   (do-something-with ?x))

Here, the < test would fail on nil.

I wonder if this issue is fully isolated to accumulators. I think it is since they can allow propagation with no facts, and therefore not "respect their bindings".

WilliamParker commented 6 years ago

I can see an argument for having nil bindings not present in one branch just be nil when the user tries to use them when not present if we're talking about usage in the RHS. My concern is more around logical oddities that arise when a condition is sorted as being able to create variables, but actually doesn't, and later conditions actually dispatch on and can actually create that variable. I created one such scenario in issue 374. Perhaps we could create some idea of "things that definitely create a variable get sorted above those that might" or of sorting after the transformation into DNF is complete rather than before. I'll have to give that more thought.

WilliamParker commented 6 years ago

So, some more thoughts:

One example of a more complicated case:

[:or [?1 <- (acc/all) :from [A (= ?2 field)]]
     [?2 <- (acc/all) :from [A (= ?1 field)]]]
[?1 <- (acc/all) :from [C (= ?2 field)]]

Note that we could still apply performance optimizations like moving accumulators down in the rule conditions in many cases if we wrote logic to detect cases where doing so would be safe, which in practice is probably the majority of cases. That said, it would be a breaking change, which I'm not excited about.. but if it is the best option and will get us on a logically sound footing for the future might be worth it. FWIW, I can't recall communications or docs suggesting this as a pattern, and I suspect most of the time people would write this with the accumulator below the bindings it consumes since this is a more logically "top-to-bottom" flow IMO.

(defrule nil-binding-rule
   [:or [First (= ?x 1)]
          [Second (= ?y 2)]]
   =>
  (println "X: " ?x " --- " "Y: " ?y))

The troublesome aspect of this to me is what happens when ?x or ?y is used in subsequent conditions. For example, in the case described in issue 374. I'm toying with the idea of allowing users to use this sort of pattern in the RHS, but restricting when this pattern can be used in the LHS to cases where we can reliably determine a clear logical meaning. One possibility that has occurred to me is preventing a later condition from "creating" either ?x or ?y, although this is a pretty broad restriction and perhaps we could come up with something more narrow. Maybe even just restrict the "maybe-create, then use, then create" pattern in issue 374. I'd be interested in others thoughts on such possibilities.