open-policy-agent / opa

Open Policy Agent (OPA) is an open source, general-purpose policy engine.
https://www.openpolicyagent.org
Apache License 2.0
9.62k stars 1.33k forks source link

topdown: function cache treats 1.0 same as 1; even if cached function does not #5845

Closed srenatus closed 1 year ago

srenatus commented 1 year ago

ℹ️ Sorry the description is murky -- I'll update it when I understand better what's what here.

With this policy, x.rego:

package test
import future.keywords.if

test_foo {
    "1" == f(1)
    "1" == f(1.0)
}

f(s) := s if is_string(s)
else := x if {
    x := sprintf("%d", [s])
    not contains("%!", x)
}

We get this:

$ opa test -t wasm x.rego   
x.rego:
data.test.test_foo: FAIL (47.62675ms)
--------------------------------------------------------------------------------
FAIL: 1/1
$ opa test x.rego           
PASS: 1/1

Note that the topdown eval of this test passes -- while it should not! Adding a print(x) to the else branch,

else := x if {
    x := sprintf("%d", [s])
    print(x)
    not contains("%!", x)
}

we get this output:

$ opa test -t wasm x.rego   
x.rego:
data.test.test_foo: FAIL (50.983724ms)

  1
  %!d(float64=1)

--------------------------------------------------------------------------------
FAIL: 1/1
$ opa test x.rego -v      
x.rego:
data.test.test_foo: PASS (641.423µs)

  1

--------------------------------------------------------------------------------
PASS: 1/1

Furthermore, removing the first statement of test_foo makes the test fail with both evaluators.

:mag: Looking at the compiler's output, we find

package test

test_foo = true {
    data.test.f(1, __local3__)
    "1" = __local3__
    data.test.f(1.0, __local4__)
    "1" = __local4__
}

f(__local0__) := __local0__ {
    is_string(__local0__)
} else = __local1__ {
    sprintf("%d", [__local0__], __local5__)
    __local1__ = __local5__
    not contains("%!", __local1__)
}

Which does not look wrong.

Update

This is what happens: when the second expression is evaluated, topdown will check if the result is already cached. In that lookup, 1.0 and 1 are compared as ast.Terms, and are equal.

The problem is that the function breaks our number type abstraction -- which says that 1 and 1.0 is the same number.

srenatus commented 1 year ago

I'll close this as self-inflicted: If your function reaches beneath the surface, weird things can happen.