Improve evaluator to detect and perform early-exit if possible

open-policy-agent / opa

Open Policy Agent (OPA) is an open source, general-purpose policy engine.

https://www.openpolicyagent.org

Apache License 2.0

9.57k stars 1.33k forks source link

Improve evaluator to detect and perform early-exit if possible #2092

Closed tsandall closed 2 years ago

tsandall commented 4 years ago

Currently the evaluator does not implement any kind of "early-exit" (or short-circuit or find-one) behavior that stops evaluation when results are no longer needed. For example, given a rule set where all of the rules return the same value (e.g., allow = true { ... }), OPA will evaluate every rule returned by the indexer. In some cases, it would be simple enough to detect and stop evaluation of subsequent rules once one result had been found. A simple heuristic to start would be complete rules where that assign the same value (under different conditions.) For example:

default allow = false

allow = true { # logic }
allow = true { # logic }
allow = true { # logic }
allow = true { # logic }
...

Similarly, the evaluator should also be able to short-circuit any unnecessary backtracking in the case of iteration on data:

allow { input.groups[_] == "admin" }

If a single match is found, no backtracking needs to be performed.

For debug/test purposes, there should be a flag that lets users disable this behavior.

Note this was raised long ago in #234 but never got implemented.

tsandall commented 4 years ago

This was raised in the recent GK security audit.

srenatus commented 3 years ago

The issue of side effects was brought up: technically, any one of the allow { # logic } rule bodies could do an http.send(). Short-circuiting the evaluation could make those go away.

Similarly, the evaluator should also be able to short-circuit any unnecessary backtracking in the case of iteration on data

👍 Those would be safe cases: anything iterated on from the base documents (the base doc portion of data, and all of input) would be OK to short-cut, there cannot be any side effects there.

For general short-cutting the evaluation of virtual document rules, a new keyword mirroring every (#2090) could be introduced:

allow {
  any x { input.container[x] }
  startswith(input.container[x].image, "example.com")
}

It would have the effect that after the first x was bound successfully, and the rule body expressions evaluated successfully, too, no further attempts would be made.

☝️ Picked "any" here for illustration purposes, and because another some would probably be confusing.

tsandall commented 2 years ago

@srenatus I've filed an issue to improve the docs around http.send because I don't think we should have to assume that http.send has side effects--it should absolutely not be relied on for that. AFAIK, http.send is the only case that we have today like that--so if we assume everything is side effect free, we don't have to introduce any new syntax into the language.

If we make this assumption then the evaluator just has to know when it can short-circuit. Evaluation of complete docs where the value is constant is the simplest case though I could imagine others:

Function evaluation when the result is a constant
Membership tests on partial sets and set comprehensions

I'd think the analysis could be implemented in the compiler and the result could be cached. When the interpreter looks up rules during complete doc evaluation, it could check the compiler data structure to see if early-exit is allowed. If early-exit is allowed, it could stop iterating once a result was found.

Perhaps the early-exit bit could be stored on the rule index lookup result. That would work for both cases of early exit (i.e., early exit on rule sets and early exit on iteration within a single rule.)