Closed lucperkins closed 4 months ago
Hm, what's an example of something that doesn't work without this PR? In this example, is the goal to share a Context
between two threads?
Why would you want to bind Context
to Axum state? Context is not intended to be thread-safe. There's really two concepts in this CEL implementation:
In the context of an Axum server, let's say that you were implementing some sort of auth validation or something with CEL. The program could be shared across Axum handlers (this would be the CEL logic for validating), but on each Axum request, you'd create a new Context and add variables like the request path, the client IP, etc which could then be reference in the thread-safe program.
This is sort of the mental model that I have for the difference between a Program
(thread-safe) and a Context
(not thread-safe).
Can you elaborate more on how your use-case differs from the design here?
Basically I want to be able to make authorization decisions for HTTP calls. Policy conditions, expressed as CEL, are supplied by admins and stored somewhere (database, in-memory, etc.). When users want to do something, a Program
is created on the fly and evaluated using a Context
. That Context
would consist of a corpus of built-in functions and variables but would be enriched with specific values for specific evaluations. So let's say admin Big Boss has created the CEL policy username != "bad-actor-420"
for some action. Then, one of the admin's employees wants to perform that action. The server determines that the employee works for Big Boss and fetches Big Boss's CEL condition. The Context
is enriched with a value for that user's username
and then the Big Boss's CEL policy is evaluated, with the result of the evaluation varying based on the user (so good-guy-steve
is allowed to perform the action but bad-actor-420
is not). So in this case, the Program
varies and, preferably, the Context
stays mostly the same.
Now, I'm not wedded to this model and if you think there's a better way to do things given those goals, I'm all ears. The only alternative that really comes to mind is building a brand new context for every request, which I would prefer not to do.
@lucperkins thanks for the explanation of the use-case. It makes perfect sense. In addition to making the functions Send + Sync
, we'd also need to make the Context itself Send and Sync, correct? What would you propose there? Wrapping everything in a Mutex?
You know what could be interesting... the context can either be a root context, or a child context (see blog post about it) where the child context just has a reference to it's parent which could be another child context, or it could be the root context. This is how we support variable shadowing in CEL macros like .map. When a variable is not found in a child context, we traverse upwards until we get to the root.
We could look at implementing copy-on-write semantics in order to make Context Sync
, where you can initialize your functions, your global variables that never change, etc. But then as soon as you add a variable, under the hood you get back a new child context that references the original parent context, allow you to share all the functions and globals, but be able to get a new context with just the additional variables you need for the request.
The only other question is whether we'd run into lifetime issues where the compiler won't be able to prove that the root context outlives the child contexts passed to the Axum handlers.
Anyways, that's sort of a brain-dump. Thoughts?
@clarkmcc First off, thanks a ton in turn for entertaining my suggestion and thinking it through so thoroughly. @cole-h has a maybe-solution to this where he has a struct that wraps Context
:
struct PolicyDecider(Context<'static>);
impl PolicyDecider {
fn new() -> Self { Self(Context::default()) }
fn evaluate(&self, condition: &str) -> Result<Value, ExecutionError> {
let mut inner_context = self.0.new_inner_scope();
inner_context.add_variable_from_value("someVar", Value::from("someVal"));
let program = Program::compile(condition)?;
program.execute(inner_context)
}
}
So basically using the mechanism that you're currently using for map
and others. Seems to be working fine in testing and at least seems to avoid any potential mutability-related issues. Does this seem reasonable to you? Anything stand out? If it does seem reasonable, I'd be happy to add some docs around this because I suspect others will have similar goals.
@lucperkins yes, this is exactly what I had in mind.
If this meets your needs, I think I'd also like to clean up the API a bit so that it's more obvious what functions to call to do exactly this. I'm also trying to think through whether we can just make this implicit behind the scenes -- i.e. context is always immutable and when you add a variable, it creates a child context on the spot and gives it back. I would need to figure out the best way to do that with support for batching variables into the context, and I would also need to figure out if doing this implicitly presents lifetime challenges for the user.
@clarkmcc I'd say this indeed meets our needs for now. More than happy to help out re: API design, so please don't hesitate to ping us. I'm closing this specific PR for now as I definitely agree about Send + Sync
. I continue to be impressed by this project!
@clarkmcc I take it back. I conferred with Cole again, and actually we want to re-propose this specific change. Even when carefully handling mutability as in the pattern I laid out above and using Arc<PolicyDecider>
, not having Function
be Send + Sync
breaks a lot of stuff in Axum (like error handling). We'll continue to explore other solutions here but for now, this would be a helpful unblocker for us.
Would this PR be sufficient though? Context is still not Sync
even with these changes I believe.
And if I could throw out something else for consideration: there's probably a good argument for registering functions with a Program
not with a Context
. The functions that are required are going to be coupled with the Program
that you're running, and there's not a good reason that I can think of that you'd want to take the same program and execute it multiple times with different functions in the context.
If we were to move functions to the Program
, and then you initialized a new Context
for each Axum request, putting into it only the required variables, would that meet your needs, or are there still benefits to sharing large portions of the Context
outside of functions?
@clarkmcc In our case, it's okay for Context
to not be Sync
.
The issue with your suggestion, I think, is that we don't know what the Program
will be until the request arrives. We ascertain who the user is and what they want to do, then we figure out which condition (CEL string) applies to them, then we update the context with values required for the evaluation, and then we evaluate. So we basically have a big chonky initial Context
that we enrich with a few variables on demand and the Program
is just a simple string that we pull from a database.
@lucperkins no big deal, but if you have a spare moment, it would be great if you could add a minimum reproducible example of your use-case to the examples directory. Since we're pre 1.0, I'd like to capture as many use-cases as possible so that any breaking changes going forward still support the legitimate use-cases that exist in the wild.
If you don't have the time, not a big deal at all.
@clarkmcc More than happy to!
This makes implementations of
Function
usable in multi-threaded contexts.I'm happy to put this behind a feature flag if desired.