cedar-policy / cedar

Implementation of the Cedar Policy Language
https://www.cedarpolicy.com
Apache License 2.0
898 stars 80 forks source link

Increase PE precision for lazy expressions #880

Open aaronjeline opened 6 months ago

aaronjeline commented 6 months ago

Category

Cedar language or syntax features/changes

Describe the feature you'd like to request

In order to have typing preservation, #874 reduced the precision of partial evaluation when evaluating the lazy expressions, such as the RHS of &&/|| and the consequent/alternative of conditionals. This is issue is a marker that we should attempt to re-improve precision while still keeping type preservation. Consider the following expression as a running example:

if { foo : { bar : unknown("a") } }.foo.bar then 1 + 1 + INT_MAX else 5 + 5

Currently, PE will reduce to the following:

if unknown("a") then 1 + 1 + INT_MAX else 5 + 5

PE cannot simply return the error that evaluating 1 + 1 + INT_MAX produces, as unknown("a") may be substituted by false, resulting in 10 being the correct total evaluation. Below are three "levels" of precision we can add. I'm pretty sure they are sound, but obv a lean formalism is required.

Level 1: Simplify, if error: rollback

At this level, we simplify lazy expressions, and if we get a value or residual: great!, if we get an error, we roll back to the original un-evaluated expression. Results:

if unknown("a") then 1 + 1 + INT_MAX else 10

Level 2: Stop at error

At this level, we evaluate up until we reach an error, and then roll back one "step". Results:

if unknown('a") then 2 + INT_MAX else 10

Level 3: (if-exclusive) both error optimization.

Another precision increase is to note that if both branches error, than regardless of substitution we will get an error. Since PE doesn't guarantee to preserve the exact error obtained, this is fine. Example:

if { foo : { bar : unknown("a") } }.foo.bar then 1 + 1 + INT_MAX else 1 + "hello"

should simplify to error. Note that we can't do this for values/residuals without typed unknowns, as a being anything but a bool would result in an error.

Describe alternatives you've considered

Leave as is, current partial evaluator is sound, just sub optimally precise

Additional context

No response

Is this something that you'd be interested in working on?

cdisselkoen commented 6 months ago

Looks great, agree we should do this.

Note that we can't do this for values/residuals without typed unknowns, as a being anything but a bool would result in an error.

Since we're talking about the case where both branches result in an error anyway, couldn't we still simplify the whole if-then-else to an error?

aaronjeline commented 6 months ago

Since we're talking about the case where both branches result in an error anyway, couldn't we still simplify the whole if-then-else to an error?

Yes, I was making the comment that the residual if unknown("a") then 1 else 1 cannot be simplified to 1 because a might not be a bool. We can only make the simplification in the case that both branches are errors.

cdisselkoen commented 6 months ago

Got it

0x00A5 commented 4 weeks ago

I discussed about the reduction in PE precision with @aaronjeline on Slack a few days ago, and just want to share it here as well that this regression in precision is a hard blocker for us to upgrade our Cedar dependencies above v3.3, since our cedar-to-SQL translation library relies on the previous behaviour.