clarkmcc / cel-rust

Common Expression Language interpreter written in Rust
https://crates.io/crates/cel-interpreter
MIT License
362 stars 18 forks source link

Can't reference a variable named like a function #53

Closed alexsnaps closed 2 months ago

alexsnaps commented 2 months ago

This simple expression, taken straight from the spec, fails: size(requests) > size

Reproducible test case:

let program = Program::compile("size(requests) > size").unwrap();
let mut context = Context::default();
let requests = vec![Value::Int(42), Value::Int(42), Value::Int(42)];
context.add_variable("requests", Value::List(Arc::new(requests))).unwrap();
context.add_variable("size", Value::Int(42)).unwrap();
program.execute(&context) // Err` value: ValuesNotComparable(Int(3), Function("size", None))

While as per the doc:

the first size is a function, and the second is a variable.

Here both size result in the function. And the variable gets overshadowed by the function always.

alexsnaps commented 2 months ago

I'll look into this for a bit...

Same with custom functions:

let program = Program::compile("siz('requests') > siz").unwrap();
let mut context = Context::default();
context.add_function("siz", siz);
context.add_variable("siz", Value::Int(42)).unwrap();
let value = program.execute(&context) // `Err` value: ValuesNotComparable(Int(8), Function("siz", None))
clarkmcc commented 2 months ago

Okay, looks like a combination of the AST and how the interpreter resolves Member expressions.

Program {
    expression: Relation(
        Member(
            Ident(
                "size",
            ),
            FunctionCall(
                [
                    Ident(
                        "requests",
                    ),
                ],
            ),
        ),
        GreaterThan,
        Ident(
            "size",
        ),
    ),
}

When resolving a member expression, we resolve the left side independently of the right side. https://github.com/clarkmcc/cel-rust/blob/5d38c5ef777b8ade5c417f0d3ff9066cadbd61f9/interpreter/src/objects.rs#L498-L501

Looking through a few different possible solutions.

clarkmcc commented 2 months ago

The following patch fixes the problem, but I want to spend a little more time and see:

  1. If something like this should be resolved in the parser instead.
  2. If not, let's harden this code a little more and make sure this actually is correct.
Index: interpreter/src/objects.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/interpreter/src/objects.rs b/interpreter/src/objects.rs
--- a/interpreter/src/objects.rs    (revision 5d38c5ef777b8ade5c417f0d3ff9066cadbd61f9)
+++ b/interpreter/src/objects.rs    (date 1718824267225)
@@ -496,8 +496,17 @@
                 .into()
             }
             Expression::Member(left, right) => {
-                let left = Value::resolve(left, ctx)?;
-                left.member(right, ctx)
+                let value = if let Member::FunctionCall(_) = right.as_ref() {
+                    if let Expression::Ident(name) = left.as_ref() {
+                        Value::Function(name.clone(), None).member(right, ctx)?
+                    } else {
+                        unreachable!()
+                    }
+                } else {
+                    let left = Value::resolve(left, ctx)?;
+                    left.member(right, ctx)?
+                };
+                Ok(value)
             }
             Expression::List(items) => {
                 let list = items
@@ -520,13 +529,7 @@
                 })
                 .into()
             }
-            Expression::Ident(name) => {
-                if ctx.has_function(&***name) {
-                    Value::Function(name.clone(), None).into()
-                } else {
-                    ctx.get_variable(&***name)
-                }
-            }
+            Expression::Ident(name) => ctx.get_variable(&***name),
         }
     }

@@ -787,7 +790,7 @@

 #[cfg(test)]
 mod tests {
-    use crate::{objects::Key, Context, Program};
+    use crate::{objects::Key, Context, Program, Value};
     use std::collections::HashMap;

     #[test]
@@ -863,4 +866,14 @@
         let value = program.execute(&context).unwrap();
         assert_eq!(value, false.into());
     }
+
+    #[test]
+    fn test_function_and_variable_name() {
+        let mut context = Context::default();
+        context.add_variable("size", 1).unwrap();
+        context.add_variable("requests", vec![1, 2, 3]).unwrap();
+        let program = Program::compile("size(requests) > size").unwrap();
+        println!("{:#?}", program);
+        assert_eq!(Value::Bool(true), program.execute(&context).unwrap());
+    }
 }
alexsnaps commented 2 months ago

On the parser comment, I'm a bit confused by your statement...

This is indeed what I expected:

Program {
    expression: Relation(
        Member(
            Ident(
                "size",
            ),
            FunctionCall(
                [
                    Ident(
                        "requests",
                    ),
                ],
            ),
        ),
        GreaterThan,
        Ident(
            "size",
        ),
    ),
}

Did this quick test too:

    let program = Program::compile("size(requests) > siz").unwrap();
    let mut context = Context::default();
    let requests = vec![Value::Int(42), Value::Int(42), Value::Int(42)];
    context.add_variable("requests", Value::List(Arc::new(requests))).unwrap();
    context.add_variable("siz", Value::Int(42)).unwrap();

which indeed results in

Program { expression: Relation(Member(Ident("size"), FunctionCall([Ident("requests")])), GreaterThan, Ident("siz")) }

The parser "only" creates a FunctionCall node on (I guess) the parenthesis? The interpreter makes a wrong guess on the rhs operand, no? So confused about the option 1:

If something like this should be resolved in the parser instead.

What do you have in mind?

clarkmcc commented 2 months ago

What do you have in mind?

Without thinking too hard, something like this might be easier on the interpreter. But I actually didn't have to write the parser so there are undoubtedly challenges with this approach that I haven't yet considered. This is just what I meant by "solve it in the parser".

FunctionCall{
    name: Ident("size"),
    args: [
        Ident(
            "requests",
        ),
    ]
}
alexsnaps commented 2 months ago

Right, so that size('123'), instead of

    expression: Member(
        Ident(
            "size",
        ),
        FunctionCall(
            [
                Atom(
                    String(
                        "123",
                    ),
                ),
            ],
        ),
    ),

becomes

    expression: 
        FunctionCall(
            Ident(
                "size",
            ),
            [
                Atom(
                    String(
                        "123",
                    ),
                ),
            ],
        ),
    ),

And promote FunctionCall as an Expression, is that right?

I think it makes sense... at least when reading the call-site. This if let Member::FunctionCall(_) = ... almost wants to be expressed as another branch to the match expr than the Expression::Member(left, right) it is in... So thinking that having FunctionCall being an Expression (with the form you laid out), seems very much reasonable to me, wdyt?

clarkmcc commented 2 months ago

Yeah, that's exactly what I had in mind. Implementation is a different question but I think this idea makes sense from the interpreters perspective.

alexsnaps commented 2 months ago

Also, what would requests.contains(42) where requests = [1, 2, 42] look like? Today that's:

Program {
    expression: Member(
        Member(
            Ident(
                "requests",
            ),
            Attribute(
                "contains",
            ),
        ),
        FunctionCall(
            [
                Atom(
                    Int(
                        42,
                    ),
                ),
            ],
        ),
    ),
}

Which would hit that else { unreachable!() } with the patch above... I think it remaining an Member makes sense... but with rhs being Expression::FunctionCall then? 🤔

alexsnaps commented 2 months ago

Read the go source a bit... from there it seems that'd be a baseCallExp which would either have a Target expression or not. i.e. requests.contains(42) vs size('hello, world!')... I can take a crack at doing that way too in the parser and adjust the interpreter. So it would look close to what you had in mind, but not quite:

FunctionCall: Expression {
    name: Ident,
    target: Member, (optional)
    args: Ident[]
}

wdyt?

A little confused about this isMember bool that seems to simply be derived from the target field 🤔

clarkmcc commented 2 months ago

Yes, that approach looks good if you'd like to give it a shot!

As far as the IsMemberFunction method since it's an interface method, that would be the only way besides calling Target to determine if a concrete implementation of the interface has a target, and I wonder if the helper method is provided because Target is a potentially expensive operation depending on the implementation? Just a guess, but I think your assessment is correct, this could be derived from the target value.

alexsnaps commented 2 months ago

Cool, I'll look into it... but at the earliest next week. Just don't interpret my silence as if I'd disappeared. Going into a week off from work next week. But will still give this some love, if only occasionally.

alexsnaps commented 2 months ago

But will still give this some love, if only occasionally.

I lied... I didn't write a single line of code. Back tho...