clarkmcc / cel-rust

Common Expression Language interpreter written in Rust
https://crates.io/crates/cel-interpreter
MIT License
362 stars 18 forks source link

Dynamic value support #43

Open chirino opened 3 months ago

chirino commented 3 months ago

Would be nice if you could dynamically values in case your data set is too large to provide all upfront via context variables.

I'm think of something like:

// You can resolve dynamic values by providing a custom resolver function.
let external_values = Arc::new(HashMap::from([("hello".to_string(), "world".to_string())]));
let resolver = Some(move |ident: Arc<String>| {
    let name: String = ident.to_string();
    match external_values.get(name.as_str()) {
        Some(v) => Ok(Value::String(v.clone().into())),
        None => Err(ExecutionError::UndeclaredReference(name.into())),
    }
});

let mut ctx = Context::default();
ctx.set_dynamic_resolver(resolver);
assert_eq!(test_script("hello == 'world'", Some(ctx)), Ok(true.into()));

I think the above should be doable without too many changes to the current apis and allow you to dynamically resolve simple variables.

But I don't think that enough, it would be nice if you could also write expression like this:

assert_eq!(test_script("github.orgs['clarkmcc'].projects['cel-rust'].license == 'MIT'", Some(ctx)), Ok(true.into()));

For that to work I think the resolver would need to return a Value that is treated like a Map but whose members are dynamic but it's data is backed by a user defined data type. I think I'm saying the above may need Value to carry a generic variant for custom data types. Thoughts?

clarkmcc commented 3 months ago

Huh, very fascinating idea... at first glance I wouldn't have a problem with something like this. It's effectively no different than registering a function that returns some brand new data not already in the context.

Your second suggestion is less likely to happen because

I think what we could do is add a find function that is similar to filter, except that it only returns the first match. Then you can chain accomplish this same thing without ambiguity and in a single expression.

chirino commented 3 months ago
  • It would be a non-backwards compatible departure from existing CEL-spec syntax.
  • It ambiguously matches objects presumably that are in an array. What property in the array are we matching "clarkmcc" on? If it's a map instead of an array, what property is keyed? Why would it be keyed by name vs ID?

Not sure what you mean by the "ambiguously" orgs is not an array, it's is a map that has a "clarkmcc" key the gives back an organization object.

You can think of that example as just using a bunch of nested maps since maps can be accessed via index or field members.

BTW i've got a branch with an implementation of the first idea to pass.. https://github.com/chirino/cel-rust/blob/dynamic-resolver/interpreter/src/functions.rs#L750-L756

and with some ugly hacks it can handle the second case too, but I think it's kinda painful: https://github.com/chirino/cel-rust/blob/dynamic-resolver/interpreter/src/functions.rs#L770-L797

clarkmcc commented 3 months ago

Not sure what you mean by the "ambiguously" orgs is not an array, it's is a map that has a "clarkmcc" key the gives back an organization object.

If it's just a map of maps, then isn't github.orgs['clarkmcc'].projects['cel-rust'].license already supported out of the box today? "If e evaluates to a map, then e.f is equivalent to e['f'] (where f is still being used as a meta-variable, e.g. the expression x.foo is equivalent to the expression x['foo'] when x evaluates to a map." CEL Specification

BTW i've got a branch

Thanks! I'll take a look.

chirino commented 3 months ago

That's why I was confused about the "CEL-spec syntax" comment. The parser is not being changed, and if the values resolve as expected then it should work.

The issue here is that we don't have enough memory to fill the context with all the data needed to represent what is available e on git hub. So an expression like github.orgs['clarkmcc'].projects['cel-rust'].license would need to be dynamic based on which map fields are being accessed. The whole github.orgs['clarkmcc'].projects['cel-rust'].license expression would ideally result in only 1 HTTP request against github. Like when github is requested, idealy I resove it to some custom value that lets me know you asked for the github API, then the .org I would want a function that then get custom value and determines if the .orgs member is valid for for it.. if so then I resolve to a new custom value that keeps track of that fact your accessing orgs. and so on. Would be nice if the custom value could a user defined type.

Right now my branch is being forced to hack things by storing data invalid map key entries. which is not very typesafe, and not very memory efficient.

clarkmcc commented 3 months ago

Okay I see. The thing you need language support for is the lazy evaluation of that entire map traversal so that somewhere in a Rust function, you can resolve the value of that entire expression all at once, correct? Basically you're wanting this expression to be a query (for lack of a better term) that you can provide the resolution logic for?

Aside from convenience or aesthetics, is this appreciably different than a custom function that you use like this?

github_project('clarkmcc', 'cel-rust').license
chirino commented 3 months ago

Eventually that is the call I want to make, but as members are being resolved It would be nice if I could error you out early if you try to access invalid member.

Like if you have a typo, github.orgs['clarkmcc'].prAjects['cel-rust'].license then I would want to inform you prAjects is an invalid member.

chirino commented 3 months ago

Trying to update Value to take a generic resulted in the APIs drastically changing everywhere. So I tried a different approach and instead used an Any type to hold a custom user type. Seems much cleaner to support. Feature branch is at at

https://github.com/chirino/cel-rust/tree/any-value

The dynamic resolver impl to handle deep paths seems cleaner than before:

https://github.com/chirino/cel-rust/blob/any-value/interpreter/src/functions.rs#L773-L792

clarkmcc commented 3 months ago

Help me understand the purpose of the any type? We already support passing anything that implements Serialize and it's automatically converted under the hood to a Value.

Also, can you open a draft PR so I can see what changed?

chirino commented 3 months ago

Could that deal with what the dynamic resolver is doing in: https://github.com/chirino/cel-rust/blob/any-value/interpreter/src/functions.rs#L773-L792 ?

clarkmcc commented 3 months ago

Can you open a PR so that I can check it out and play with it? I'll run some tests to make sure I'm understanding.

chirino commented 3 months ago

see: https://github.com/clarkmcc/cel-rust/pull/47