clarkmcc / cel-rust

Common Expression Language interpreter written in Rust
https://crates.io/crates/cel-interpreter
MIT License
362 stars 18 forks source link

Add support for using a dynamic variable resolver and a Value::DynamicCollection that can dynamically resolve nested values of collection types. #58

Open chirino opened 2 months ago

chirino commented 2 months ago

This is an improved version of #47 it has been updated to include the suggestions given in #47.

chirino commented 2 months ago

Added a slightly modified version of the benchmark @clarkmcc provided in #47 This variation lets you easily switch between static or dynamic resolvers. It constructs the Context within the iteration loop since that better compares the overhead of populating the context in the static case.

Dynamic resolution seems to be faster, about 20%:

     Running benches/runtime.rs (target/release/deps/runtime-6a18d2a40aded4e1)
variable resolution/variable_resolution_1
                        time:   [49.821 ns 49.889 ns 49.968 ns]
                        change: [-32.108% -31.969% -31.813%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe
variable resolution/variable_resolution_10
                        time:   [490.61 ns 495.56 ns 501.14 ns]
                        change: [-23.252% -22.414% -21.410%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
variable resolution/variable_resolution_100
                        time:   [5.6754 µs 5.7097 µs 5.7500 µs]
                        change: [-19.484% -18.817% -18.102%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  8 (8.00%) high mild
  1 (1.00%) high severe
clarkmcc commented 2 months ago

It's likely faster because we're comparing apples to oranges. In this benchmark you're really seeing the difference between a hashmap lookup to produce a Value::Null and skipping the hashmap altogether (as is the case with dynamic resolution). The problem with this is it doesn't accurately capture the overhead of only dynamic resolution. To me this just tells me that it's faster to return Ok(Value::Null) than it is to hashmap.get(variable) which makes perfect sense.

To properly benchmark the overhead of dynamic resolution, the hashmap lookup should serve as the control in the experiment. This means the dynamic resolution should also include the same hashmap lookup.

For the benchmark, what happens in the dynamic resolver doesn't really matter so much as that it does the same thing as the standard resolver. In practice, the dynamic resolver is almost always going to be slower than the standard resolver because, at least in your use-case, you'll be making HTTP requests in the dynamic resolver. So the benchmark should control for the hashmap in my opinion.

I believe these adjustments more accurately control for the resolver's overhead.

Index: interpreter/benches/runtime.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/interpreter/benches/runtime.rs b/interpreter/benches/runtime.rs
--- a/interpreter/benches/runtime.rs    (revision 38c263c5ea544cd8b2509853b9fe5bbe62741eaf)
+++ b/interpreter/benches/runtime.rs    (date 1720022327640)
@@ -1,5 +1,5 @@
 use cel_interpreter::context::Context;
-use cel_interpreter::{Program, Value};
+use cel_interpreter::{ExecutionError, Program, Value};
 use criterion::{black_box, criterion_group, criterion_main, Criterion};
 use std::collections::HashMap;

@@ -71,7 +71,7 @@
     let sizes = vec![1, 10, 100];

     // flip this bool to compare the performance of dynamic resolver vs static resolvers
-    let use_dynamic_resolver = true;
+    let use_dynamic_resolver = false;

     for size in sizes {
         let mut expr = String::new();
@@ -86,13 +86,22 @@
         group.bench_function(format!("variable_resolution_{}", size).as_str(), |b| {
             let mut ctx = Context::default();
             if use_dynamic_resolver {
-                ctx.set_dynamic_resolver(move |_, _| Ok(Value::Null));
+                let mut variables: HashMap<String, Value> = HashMap::default();
+                for i in 0..size {
+                    variables.insert(format!("var_{i}", i = i), Value::Null);
+                }
+                ctx.set_dynamic_resolver(move |_, v| {
+                    variables
+                        .get(v)
+                        .cloned()
+                        .ok_or(ExecutionError::UndeclaredReference(v.to_string().into()))
+                });
             } else {
                 for i in 0..size {
                     ctx.add_variable_from_value(&format!("var_{i}", i = i), Value::Null);
                 }
-            }
-            b.iter(|| program.execute(&ctx).unwrap())
+            };
+            b.iter(|| program.execute(&ctx))
         });
     }
     group.finish();
@@ -100,8 +109,8 @@

 criterion_group!(
     benches,
-    criterion_benchmark,
-    map_macro_benchmark,
+    // criterion_benchmark,
+    // map_macro_benchmark,
     variable_resolution_benchmark
 );

When measuring only the overhead of the dynamic resolver, we get ~50% performance regression

Benchmarking variable resolution/variable_resolution_1: Collecting 100 samples in estimated 5.0004 s (51Mvariable resolution/variable_resolution_1
                        time:   [95.403 ns 95.649 ns 95.907 ns]
                        change: [+53.123% +54.927% +56.336%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) high mild
  1 (1.00%) high severe
Benchmarking variable resolution/variable_resolution_10: Collecting 100 samples in estimated 5.0029 s (5.variable resolution/variable_resolution_10
                        time:   [980.32 ns 986.39 ns 992.87 ns]
                        change: [+47.167% +48.708% +50.222%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
Benchmarking variable resolution/variable_resolution_100: Collecting 100 samples in estimated 5.0526 s (4variable resolution/variable_resolution_100
                        time:   [11.307 µs 11.360 µs 11.416 µs]
                        change: [+51.871% +53.270% +54.583%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
chirino commented 2 months ago

So the use case of using http to resolve values was just an illustration. My actual use case looks more like I get a gRPC request with a large payload document. I want to run a CEL expression against that document. Converting that whole document to cel Values seems like more overhead than dynamically picking out the fields of that document based on the expression.

clarkmcc commented 2 months ago

That sounds like a more legitimate use-case.

How are you envisioning we'd preserve the spec's Field Selection requirements with a dynamic resolver present? In order for the dynamic resolver to be useful, you want the full path of the expression before you do any resolving. But that doesn't align with the spec and would result in different error behaviors depending on whether the dynamic resolver is added or not.

If we did follow the spec on field selection, then the dynamic resolver would need to traverse each layer of the object converting each layer to a CEL value, which doesn't really get you the wins that you're looking for.

chirino commented 2 months ago

I'll try to update the test_deep_dynamic_resolver test in this PR to be a better example of the usecase. Maybe you can help figure out the edge cases where if fails the spec expectations.

chirino commented 2 months ago

Ok.. the latest commits change things a bit, instead of a Value::Any we add a Value::DynamicCollection which holds a resolver closure. This made it a bit simpler to write the more realistic deep resolver test case. This also further simplified the resovler function signature.

chirino commented 2 months ago

Added a feature that integrates with Parc types which are like Arcs but which let you access deep references of the object via reference projections.

clarkmcc commented 2 months ago

That's pretty awesome actually! But I am concerned that they're pretty convoluted to use. Do you think a derive macro could generate this for a struct?

chirino commented 2 months ago

derive macro might be possible, but that should likely be a separate PR.

chirino commented 2 months ago

One thing I might want to do part of this PR is see if it's possible to provide a blanket impl a From<Vec where T: DynamicCollection which should simplify the vec indexing cases.

clarkmcc commented 2 months ago

derive macro might be possible, but that should likely be a separate PR.

Could be. I'm just looking at what it takes to use, and I don't see this feature being very useful if it requires all this manual boilerplate. Imo they're pretty tightly coupled. But if we can get a separate PR up at the same time then that works for me. What I don't want is to introduce a feature that is unwieldy to use and then never get around to cleaning up the user experience.

One thing I might want to do part of this PR is see if it's possible to provide a blanket impl a From<Vec where T: DynamicCollection which should simplify the vec indexing cases.

👍

chirino commented 2 months ago

@clarkmcc ok I think the last bits are in there. Could we get this in without the macro? Our project could take advantage of this as is.

clarkmcc commented 2 months ago

Let me play around with the PR now that it's ready to go, it's big enough I can't once-over it here in Github 😁

chirino commented 2 months ago

ok thx.

chirino commented 2 months ago

squashed and rebased to avoid merge conflicts.

clarkmcc commented 2 months ago

Thanks, I took a look at this on Saturday but didn't finish going through it. Still WIP

chirino commented 1 month ago

any updates?

clarkmcc commented 1 month ago

I'm working through some ideas based on #73. Seems like the right solution could solve #73, this, and what you're proposing here -- allowing CEL to get arc references to fields on Rust structs and skip serializing the objects altogether.

In other words, I'm not sure about this PR as it currently stands, but I want to take a lot of what you have here to improve this project on several different fronts.