CEL expressions - Githubissues

eguzki commented 6 days ago

Currently, conditions, matches (or whatever you want to call them), are expressed as a PatternExpressions

pub struct PatternExpression {                           
    pub selector: String,                                
    pub operator: WhenConditionOperator,                 
    pub value: String,                                   

    #[serde(skip_deserializing)]                         
    path: OnceCell<Path>,                                
    #[serde(skip_deserializing)]                         
    compiled: OnceCell<CelExpression>,                   
}

Which in configuration looks like:

matches:
    - selector: request.url_path
      operator: startswith
      value: /get

The ask here is to leverage the Common Expression Language to implement predicates that evaluate to bool.

Something like

predicates:
    - "request.url_path.startsWith('/get')"

Note that data type on the configuration: CEL expressions will be strings that need to be parsed.

eguzki commented 6 days ago

The envoy attriubutes are available through a WASM-ABI, which in rust, and for Envoy, looks like

pub fn get_property(path: Vec<&str>) -> Result<Option<Bytes>, Status>

There is no way to add all the attributes to the CEL context and make then available to expressions directly. So the question is about how to make envoy attributes (and I underline Envoy here) on the CEL expressions.

Approach 1: CEL functions to the rescue

predicates:
    - "get_property(request.url_path).startsWith('/get')"

Here. we have defined a custom function get_property, that will bind input param(s), i.e. envoy attribute path, to the WASM-ABI actual pub fn get_property(path: Vec<&str>) -> Result<Option<Bytes>, Status> method.

PROS: we can do this now CONS: expressions are "uglier", IMO though.

Approach 2: CEL Lazy values Or CEL dynamic values

I came across these two RFE on the cel-rust repo:

Dynamic value support https://github.com/clarkmcc/cel-rust/issues/43
Add Value::Lazy variant https://github.com/clarkmcc/cel-rust/issues/92

I would say both would enable us to implement CEL expressions in a clean way, i.e.

predicates:
    - "request.url_path.startsWith('/get')"

PROS: Clean and straightforward CONS: Not available today from cel-rust

The Envoy locking here is undesirable, but that's what it is in the current state and context.

alexsnaps commented 6 days ago

There is a third option, which is what actually happens today already:

parse the expression and replace member navigation in the AST directly. Today, when you have a condition that says request.url_path, it gets flatten to a single binding: attribute.

Also, there is more work that I think should happen as part of this: align authorino & limitador's lingo (or at least at the Kuadrant level). I was thinking having the so-called context: AttributeContext struct be reflected through all our CEL environments. wdyt?

alexsnaps commented 6 days ago

Also, I think having option 1, probably more like get_property("request", "url_path") tho - possibly type safe too, is something I'd do in any case. It's real straight forward, as it is exposing a sugar coated function from the abi to CEL, but mostly gives us a good fallback in case we need it. Finally, tho still be discusses how the Kuadrant Operator will inject these expressions in the config, that replacement could also be done prior configuration the wasm-shim... Which would then mean very different looking config too possibly tho

maleck13 commented 6 days ago

I like the function option. I don't believe adding this precludes us from allowing request.url_path.startsWith('/get') in a future release as we would maintain the function option but enhance our CEL support with the second option. I wont pretend to know everything about the innards of this, but I don't see any particular issue with the function as the first way to achieve this.

alexsnaps commented 6 days ago

And this is NOT public API here. We can absolutely replace request.url_path as it'd be expressed in a user facing Policy with get_property("request", "url_path") in the wasm's configuration as part of the validation happening in the Kuadrant controller.

alexsnaps commented 6 days ago

I think I have a contingency plan here: just have everything become cel expressions (but the routeRuleConditions (which will become routeRulePredicates then? that probably should use the get_property([]) function directly...). On parsing the expressions, we should calculate an "execution plan" for member navigation, i.e. what are all the props accessed and populate a Map with them... request.url_path can navigate such a map just as well as a Struct, so all that needs to be done is:

Parse expression
Get all the Ident & member navigation from the AST
Store that along side the Expression and
Populate the Map (of Maps, of... potentially) and inject it in the Context prior evaluation

eguzki commented 6 days ago

Something like this? From https://docs.rs/cel-parser/0.7.1/cel_parser/ast/struct.ExpressionReferences.html

let expression = parse("foo.bar == true").unwrap();
let references = expression.references();
assert_eq!(vec!["foo"], references.variables());

Interesting idea... could work. But we would need the entire variable instead of the first level. As in the example "foo.bar == true", the references.variables() method returns vector with one element: foo. We cannot infer the full path from it. Can we?

eguzki commented 6 days ago

Also, there is more work that I think should happen as part of this: align authorino & limitador's lingo (or at least at the Kuadrant level). I was thinking having the so-called context: AttributeContext struct be reflected through all our CEL environments. wdyt?

Interesting about AttributeContext. I guess the Wasm ABI would allow that with some API methods like get_http_request_headers and get_property. It would imply building a full request context for only needing few attributes, though. But why not :) benchmarking should tell.

EDIT: we are already doing that for auth service https://github.com/Kuadrant/wasm-shim/blob/main/src/service/auth.rs#L55

Makes sense to me

alexsnaps commented 5 days ago

So here's how this looks like:


let exp = cel_parser::parse(
   "request.headers.all(n, n in auth.allowedHeaders) && auth.identity.secure" // our predicate
).unwrap();

let mut all = Vec::default();
properties(&exp, &mut all, &mut Vec::default()); // gets all member navigation from the AST

assert_eq!(all[0], vec!["request", "headers"]);
assert_eq!(all[1], vec!["auth", "allowedHeaders"]);
assert_eq!(all[2], vec!["auth", "identity", "secure"]);

assert_eq!(all.len(), 3);

So the idea is to use that all: Vec (calculated once at parse time and stored along side the compiled expression), to populate a ValueType::Map that would contain the values of all the paths navigated.

This is from an actual passing test locally. Am now working on the second part, which is lazily populating all that data.

alexsnaps commented 5 days ago

@chirino I think this covers what you were trying to address with this PR... Anything you see that wouldn't be covered?

alexsnaps commented 5 days ago

Also, on that front, I wasn't up to date:

I was thinking having the so-called context: AttributeContext struct be reflected through all our CEL environments. wdyt?

We are NOT having context as root binding. I deleted it from the authorino CEL support. It's straight to metadata, request, source, destination & auth.

chirino commented 5 days ago

@chirino I think this covers what you were trying to address with this PR... Anything you see that wouldn't be covered?

Yeah. I think this would work. If understand correctly, you would use this to minimize the values you add into the cel context right?

alexsnaps commented 5 days ago

Yeah. I think this would work. If understand correctly, you would use this to minimize the values you add into the cel context right?

Yes, using a Map instead of a "lazy Struct, as the . dot member navigation works for both (i.e. map.key is the same as struct.field accessor, so we pretend the Map is the Struct ...)

eguzki commented 5 days ago

@alexsnaps I was thinking on start working on this issue. It is being unassigned, but I feel you already started. What's the current state?

alexsnaps commented 5 days ago

Yeah, I'm on it, should be done by EOD

eguzki commented 4 days ago

Looking forward to seeing the implementation of

properties(&exp, &mut all, &mut Vec::default());

alexsnaps commented 4 days ago

In rebase hell, but... I won't have you wait any longer ;)

fn properties<'e>(
    exp: &'e Expression,
    all: &mut Vec<Vec<&'e str>>,
    path: &mut Vec<&'e str>,
) {
    match exp {
        Expression::Arithmetic(e1, _, e2)
        | Expression::Relation(e1, _, e2)
        | Expression::Ternary(e1, _, e2)
        | Expression::Or(e1, e2)
        | Expression::And(e1, e2) => {
            properties(e1, all, path);
            properties(e2, all, path);
        }
        Expression::Unary(_, e) => {
            properties(e, all, path);
        }
        Expression::Member(e, a) => {
            if let Member::Attribute(attr) = &**a {
                path.insert(0, attr.as_str())
            }
            properties(e, all, path);
        }
        Expression::FunctionCall(name, target, args) => {
            if let Some(target) = target {
                properties(target, all, path);
            }
            for e in args {
                properties(e, all, path);
            }
        }
        Expression::List(e) => {
            for e in e {
                properties(e, all, path);
            }
        }
        Expression::Map(v) => {
            for (e1, e2) in v {
                properties(e1, all, path);
                properties(e2, all, path);
            }
        }
        Expression::Atom(_) => {}
        Expression::Ident(v) => {
            if path.len() > 0 {
                path.insert(0, v.as_str());
                all.push(path.clone());
                path.clear();
            }
        }
    }
}

eguzki commented 4 days ago

From https://github.com/clarkmcc/cel-rust/blob/master/parser/src/ast.rs#L155-L203, I see.

Looks like a good RFE for the Expression public API in the https://github.com/clarkmcc/cel-rust/ so we do not need to maintain this ;)

alexsnaps commented 4 days ago

Yeah... that's been an on-going discussion again lately... tho, really unsure how the "laziness problem" is best tackled from the interpreter's perspective. Plus there is bunch of work that I think should happen before that, e.g. proper protobuf interop. Which, for one would be a huge benefit to us, but would probably also impact how to best go about this... anyways... problem for future us, let's focus on our release for now 😛

Kuadrant / wasm-shim

CEL expressions #112

Approach 1: CEL functions to the rescue

Approach 2: CEL Lazy values Or CEL dynamic values