Open srlk opened 2 years ago
Thanks for the detailed report. I'm not sure there is much we can do here, though. There's complexity that we can't make disappear when using dynamic composition, so your expectation perhaps just can't be met.
That said, an increase like that, given the increase of policies, doesn't seem that bad. The step from 10 policies to 10k is huge.
The problem here isn't dynamic policy composition though, but the combination of that and the Compile API, no? I loaded the test data provided by @srlk and queries over the data API are still ~1 ms or so, which makes sense given how the query is essentially just a hash lookup. What makes the Compile API different in that regard?
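To make the comparison above concrete, here is a minimal sketch of the two request shapes: a regular Data API query and a Compile API (partial evaluation) request. The server URL, the `data.main.allow` path, the query string, and the `data.resources` unknown are assumptions for illustration only; they are not taken from the reporter's setup.

```python
import json
import urllib.request

# Assumption: an OPA server running locally on its default port.
OPA_URL = "http://localhost:8181"


def data_api_request(input_doc):
    """Return (path, body) for a regular query against data.main.allow."""
    return "/v1/data/main/allow", {"input": input_doc}


def compile_api_request(input_doc, unknowns):
    """Return (path, body) for a partial-evaluation request via the Compile API."""
    body = {
        # Hypothetical query; the thread does not show the one actually used.
        "query": "data.main.allow == true",
        "input": input_doc,
        # References that OPA should treat as unknown during partial evaluation.
        "unknowns": list(unknowns),
    }
    return "/v1/compile", body


def post_json(path, body, base_url=OPA_URL):
    """POST a JSON body to OPA and decode the JSON response."""
    req = urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Timing `post_json(*data_api_request(...))` against `post_json(*compile_api_request(...))` with the same input is one way to reproduce the gap described in this thread.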
Hey @anderseknert
You are absolutely right: the issue happens when dynamic composition is used together with the Compile API. It's not only the latency that increases; memory usage also increases with the Compile API.
With a regular query, we observe no slowdown.
As a workaround, instead of creating a single main.rego entry point, we have created multiple main entry points: main.type1.rego, main.type2.rego, main.type3.rego, ...
package main.type1

denies[x] {
    x := data.policies["type1"][input.subtype][_].denies[_]
}

any_denies {
    denies[_]
}

allow {
    not any_denies
}
This keeps Compile API performance manageable until the remaining dynamic part of the policy (input.subtype in this example) grows in the number of objects.
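With one entry point per type, as in the workaround above, the caller has to pick the matching package when building the Compile API query. A minimal sketch of that selection follows; the helper name and the query shape are hypothetical, and nothing in the thread confirms that the reporter's client works this way.

```python
import re

# A policy type must be a plain identifier so it can be used as a package segment.
_SEGMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def entry_point_query(policy_type):
    """Build the query for the per-type entry point, e.g. package main.type1."""
    if not _SEGMENT.match(policy_type):
        raise ValueError(f"not a valid package segment: {policy_type!r}")
    # Hypothetical query shape: ask whether the per-type allow rule holds.
    return f"data.main.{policy_type}.allow == true"
```

Because the type is now a constant in the query, only `input.subtype` remains a dynamic reference inside each entry point.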
Given that the inputs used for policy composition are known, which they are in this case, and that replacing those with constants renders the expected performance characteristics... yes, this looks like a bug to me.
Any updates here, @philipaconrad?
@anderseknert The situation is better overall, now that #5307 is merged. It sped up type checking during Compile operations, and pushed the "degradation zone" for policy compilation out to around 2500+ policies, instead of ~1000 policies.
However, I can't say the issue is entirely resolved; there are still issues where the Golang slice allocator blows up during type checking, and that will require some serious refactoring work to resolve. It's why I didn't mark #5307 as resolving this issue. :sweat_smile:
Thanks @philipaconrad! @srlk it would be interesting to see what your numbers look like testing this with OPA v0.46.1
Closing this given the updates in https://github.com/open-policy-agent/opa/pull/5307. Feel free to re-open if the issue still exists.
@ashutosh-narkar The allocation explosion in the typechecker still exists at larger policy sizes, based on the benchmarks in #5757, and #5307. We should reopen this, until we address the underlying problem.
@philipaconrad / @ashutosh-narkar any plans to restart the work on this task? We're very keen to reduce the compile API evaluation time even by small margins.
@AdrianArnautu we kept this issue in the backlog and intend to investigate further. I don't have a timeline atm but we'll look to address this in the next few releases. If you're interested feel free to work on this and we're happy to help.
Hello
We are using dynamic policy composition to evaluate policies. We have over 1000 policies, and a subset of them is evaluated to reach the final decision (allow/deny).
We have noticed a significant slowdown in the Compile API as the number of policies increases, even though, thanks to dynamic policy composition, the majority of the policies are not evaluated.
Short description
OPA version: 0.43.0
I have created a gist to replicate the behavior on a simplified version of our policies. More details are in the next section.
Steps To Reproduce
This will create the main.rego
And 10 policies with different package names
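The gist itself is not reproduced here. As a rough stand-in, the following sketch generates a given number of policy modules with distinct package names under `data.policies`, laid out so that a lookup like `data.policies["type1"][input.subtype][_].denies[_]` would find them. The package layout, the rule bodies, the `input.action` field, and the file names are all assumptions, not the contents of the reporter's gist.

```python
from pathlib import Path


def policy_module(policy_type, subtype, name):
    """Return the Rego source of one hypothetical policy module."""
    return (
        f"package policies.{policy_type}.{subtype}.{name}\n"
        "\n"
        "denies[msg] {\n"
        # Hypothetical condition; real policies would check real input fields.
        f'    input.action == "{name}_forbidden"\n'
        f'    msg := "{name} denies the request"\n'
        "}\n"
    )


def generate_policies(count, policy_type="type1", subtype="subtype1"):
    """Map file names to Rego sources for `count` distinct packages."""
    modules = {}
    for i in range(count):
        name = f"policy{i}"
        filename = f"{policy_type}_{subtype}_{name}.rego"
        modules[filename] = policy_module(policy_type, subtype, name)
    return modules


def write_policies(directory, modules):
    """Write each generated module into `directory`, creating it if needed."""
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    for filename, source in modules.items():
        (target / filename).write_text(source, encoding="utf-8")
```

Calling `generate_policies` with 10, 1000, and 10000 and loading each resulting directory into OPA gives a way to compare Compile API latency across policy counts.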
Expected behavior
The expected behavior is that the Compile API incurs no performance penalty when dynamic composition is used.
Additional context