open-policy-agent / opa

Open Policy Agent (OPA) is an open source, general-purpose policy engine.
https://www.openpolicyagent.org
Apache License 2.0

Performance Degrade since v0.67.0 on glob.match with lots of patterns #6908

Open amirsalarsafaei opened 1 month ago

amirsalarsafaei commented 1 month ago

Short description

Upgrading OPA from v0.66.0 to v0.67.0 can cause a performance hit due to the cache cap introduced as the solution to #6846. Consider this policy:

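package smth.authz

import rego.v1

import data.users
import data.routes

# Allow the request when a route matches and the user holds its permission.
# NOTE: `user` is not defined in this snippet; it is presumably resolved from data.users.
allow if {
    route
    user.permissions[route.permission]
}

# Select the route whose glob pattern matches the request path.
route := matched if {
    some route in routes
    glob.match(route.path, ["/"], input.path)
    matched := route
}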

I have a large but fixed number of paths, so a memory leak isn't a concern here. However, the cap forces compiled patterns to be evicted and recompiled, which results in a significant performance loss. I was able to achieve 25-50% better latency at the 99.99th percentile by increasing the cap size to match my data size (benchmarks are attached at the end).
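To illustrate why a cap smaller than the number of patterns hurts, here is a minimal Go sketch of a capped pattern cache. It is not OPA's actual implementation: the type `patternCache`, the helper `countCompiles`, the evict-one-arbitrary-entry policy, and the sizes 100 and 150 are all assumptions for illustration, and `regexp` from the standard library stands in for the real pattern compiler.

```go
package main

import (
	"fmt"
	"regexp"
)

// patternCache is a hypothetical capped cache of compiled patterns.
// It is NOT OPA's code; regexp is only a stand-in for the real compiler.
type patternCache struct {
	maxSize  int // 0 means "no cap"
	entries  map[string]*regexp.Regexp
	compiles int // how many times a pattern had to be (re)compiled
}

func newPatternCache(maxSize int) *patternCache {
	return &patternCache{maxSize: maxSize, entries: make(map[string]*regexp.Regexp)}
}

// get returns the compiled pattern, compiling and caching it on a miss.
func (c *patternCache) get(pattern string) (*regexp.Regexp, error) {
	if re, ok := c.entries[pattern]; ok {
		return re, nil
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	c.compiles++
	if c.maxSize > 0 && len(c.entries) >= c.maxSize {
		// Assumed policy: evict one arbitrary entry to stay within the cap.
		for k := range c.entries {
			delete(c.entries, k)
			break
		}
	}
	c.entries[pattern] = re
	return re, nil
}

// countCompiles simulates `passes` queries that each check all n route patterns
// and reports how many compilations were needed in total.
func countCompiles(maxSize, n, passes int) int {
	c := newPatternCache(maxSize)
	for p := 0; p < passes; p++ {
		for i := 0; i < n; i++ {
			if _, err := c.get(fmt.Sprintf("^/route/%d/[^/]+$", i)); err != nil {
				panic(err)
			}
		}
	}
	return c.compiles
}

func main() {
	fmt.Println("cap 100, 150 patterns:", countCompiles(100, 150, 10))
	fmt.Println("cap 150, 150 patterns:", countCompiles(150, 150, 10))
}
```

When the cap is at least as large as the pattern set, every pattern is compiled exactly once. When it is smaller, every pass over the routes has to recompile at least the patterns that were evicted, so the compile count keeps growing with each query.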

I can think of 2 possible solutions:

I'd be open to working on a PR. I have actually already implemented a solution on opa-envoy-plugin and would like to contribute it to the community :)

Benchmark Results:

command: bench --bundle bundle-test.tar.gz --input input.json 'data.smth.authz.allow' --e2e --count 10

*The bundle was created with an optimization level of 0, because enabling optimization caused more than 100x higher latency, but that's a separate issue that I'm investigating :D

OPA v0.67.0:

+-------------------------------------------------+---------------+
| samples                                         |           402 |
| ns/op                                           |       2863306 |
| B/op                                            |       1655651 |
| allocs/op                                       |         36653 |
| histogram_counter_server_query_cache_hit_75%    |          1.00 |
| histogram_counter_server_query_cache_hit_90%    |          1.00 |
| histogram_counter_server_query_cache_hit_95%    |          1.00 |
| histogram_counter_server_query_cache_hit_99%    |          1.00 |
| histogram_counter_server_query_cache_hit_99.9%  |          1.00 |
| histogram_counter_server_query_cache_hit_99.99% |          1.00 |
| histogram_counter_server_query_cache_hit_count  |           402 |
| histogram_counter_server_query_cache_hit_max    |          1.00 |
| histogram_counter_server_query_cache_hit_mean   |          1.00 |
| histogram_counter_server_query_cache_hit_median |          1.00 |
| histogram_counter_server_query_cache_hit_min    |          1.00 |
| histogram_counter_server_query_cache_hit_stddev |             0 |
| histogram_timer_rego_external_resolve_ns_75%    |           625 |
| histogram_timer_rego_external_resolve_ns_90%    |           677 |
| histogram_timer_rego_external_resolve_ns_95%    |           718 |
| histogram_timer_rego_external_resolve_ns_99%    |           938 |
| histogram_timer_rego_external_resolve_ns_99.9%  |          1341 |
| histogram_timer_rego_external_resolve_ns_99.99% |          1341 |
| histogram_timer_rego_external_resolve_ns_count  |           402 |
| histogram_timer_rego_external_resolve_ns_max    |          1341 |
| histogram_timer_rego_external_resolve_ns_mean   |           472 |
| histogram_timer_rego_external_resolve_ns_median |           444 |
| histogram_timer_rego_external_resolve_ns_min    |           233 |
| histogram_timer_rego_external_resolve_ns_stddev |           177 |
| histogram_timer_rego_input_parse_ns_75%         |         20524 |
| histogram_timer_rego_input_parse_ns_90%         |         22598 |
| histogram_timer_rego_input_parse_ns_95%         |         24557 |
| histogram_timer_rego_input_parse_ns_99%         |         34541 |
| histogram_timer_rego_input_parse_ns_99.9%       |        133233 |
| histogram_timer_rego_input_parse_ns_99.99%      |        133233 |
| histogram_timer_rego_input_parse_ns_count       |           402 |
| histogram_timer_rego_input_parse_ns_max         |        133233 |
| histogram_timer_rego_input_parse_ns_mean        |         18356 |
| histogram_timer_rego_input_parse_ns_median      |         17087 |
| histogram_timer_rego_input_parse_ns_min         |         10698 |
| histogram_timer_rego_input_parse_ns_stddev      |          7292 |
| histogram_timer_rego_query_eval_ns_75%          |       3048023 |
| histogram_timer_rego_query_eval_ns_90%          |       3260540 |
| histogram_timer_rego_query_eval_ns_95%          |       3591554 |
| histogram_timer_rego_query_eval_ns_99%          |       3809562 |
| histogram_timer_rego_query_eval_ns_99.9%        |       4034902 |
| histogram_timer_rego_query_eval_ns_99.99%       |       4034902 |
| histogram_timer_rego_query_eval_ns_count        |           402 |
| histogram_timer_rego_query_eval_ns_max          |       4034902 |
| histogram_timer_rego_query_eval_ns_mean         |       2557558 |
| histogram_timer_rego_query_eval_ns_median       |       2573104 |
| histogram_timer_rego_query_eval_ns_min          |       1803723 |
| histogram_timer_rego_query_eval_ns_stddev       |        571422 |
| histogram_timer_server_handler_ns_75%           |       3083889 |
| histogram_timer_server_handler_ns_90%           |       3303397 |
| histogram_timer_server_handler_ns_95%           |       3625538 |
| histogram_timer_server_handler_ns_99%           |       3846051 |
| histogram_timer_server_handler_ns_99.9%         |       4074388 |
| histogram_timer_server_handler_ns_99.99%        |       4074388 |
| histogram_timer_server_handler_ns_count         |           402 |
| histogram_timer_server_handler_ns_max           |       4074388 |
| histogram_timer_server_handler_ns_mean          |       2588862 |
| histogram_timer_server_handler_ns_median        |       2597633 |
| histogram_timer_server_handler_ns_min           |       1830482 |
| histogram_timer_server_handler_ns_stddev        |        573035 |
+-------------------------------------------------+---------------+
+-------------------------------------------------+---------------+
| samples                                         |           420 |
| ns/op                                           |       2910379 |
| B/op                                            |       1658770 |
| allocs/op                                       |         36657 |
| histogram_counter_server_query_cache_hit_75%    |          1.00 |
| histogram_counter_server_query_cache_hit_90%    |          1.00 |
| histogram_counter_server_query_cache_hit_95%    |          1.00 |
| histogram_counter_server_query_cache_hit_99%    |          1.00 |
| histogram_counter_server_query_cache_hit_99.9%  |          1.00 |
| histogram_counter_server_query_cache_hit_99.99% |          1.00 |
| histogram_counter_server_query_cache_hit_count  |           420 |
| histogram_counter_server_query_cache_hit_max    |          1.00 |
| histogram_counter_server_query_cache_hit_mean   |          1.00 |
| histogram_counter_server_query_cache_hit_median |          1.00 |
| histogram_counter_server_query_cache_hit_min    |          1.00 |
| histogram_counter_server_query_cache_hit_stddev |             0 |
| histogram_timer_rego_external_resolve_ns_75%    |           625 |
| histogram_timer_rego_external_resolve_ns_90%    |           671 |
| histogram_timer_rego_external_resolve_ns_95%    |           727 |
| histogram_timer_rego_external_resolve_ns_99%    |           902 |
| histogram_timer_rego_external_resolve_ns_99.9%  |          4314 |
| histogram_timer_rego_external_resolve_ns_99.99% |          4314 |
| histogram_timer_rego_external_resolve_ns_count  |           420 |
| histogram_timer_rego_external_resolve_ns_max    |          4314 |
| histogram_timer_rego_external_resolve_ns_mean   |           486 |
| histogram_timer_rego_external_resolve_ns_median |           520 |
| histogram_timer_rego_external_resolve_ns_min    |           230 |
| histogram_timer_rego_external_resolve_ns_stddev |           253 |
| histogram_timer_rego_input_parse_ns_75%         |         20325 |
| histogram_timer_rego_input_parse_ns_90%         |         21867 |
| histogram_timer_rego_input_parse_ns_95%         |         23886 |
| histogram_timer_rego_input_parse_ns_99%         |         41086 |
| histogram_timer_rego_input_parse_ns_99.9%       |        144677 |
| histogram_timer_rego_input_parse_ns_99.99%      |        144677 |
| histogram_timer_rego_input_parse_ns_count       |           420 |
| histogram_timer_rego_input_parse_ns_max         |        144677 |
| histogram_timer_rego_input_parse_ns_mean        |         18511 |
| histogram_timer_rego_input_parse_ns_median      |         17295 |
| histogram_timer_rego_input_parse_ns_min         |         10516 |
| histogram_timer_rego_input_parse_ns_stddev      |          8110 |
| histogram_timer_rego_query_eval_ns_75%          |       3067622 |
| histogram_timer_rego_query_eval_ns_90%          |       3327819 |
| histogram_timer_rego_query_eval_ns_95%          |       3679708 |
| histogram_timer_rego_query_eval_ns_99%          |       3888583 |
| histogram_timer_rego_query_eval_ns_99.9%        |       4008065 |
| histogram_timer_rego_query_eval_ns_99.99%       |       4008065 |
| histogram_timer_rego_query_eval_ns_count        |           420 |
| histogram_timer_rego_query_eval_ns_max          |       4008065 |
| histogram_timer_rego_query_eval_ns_mean         |       2602465 |
| histogram_timer_rego_query_eval_ns_median       |       2618018 |
| histogram_timer_rego_query_eval_ns_min          |       1775956 |
| histogram_timer_rego_query_eval_ns_stddev       |        587098 |
| histogram_timer_server_handler_ns_75%           |       3104753 |
| histogram_timer_server_handler_ns_90%           |       3360686 |
| histogram_timer_server_handler_ns_95%           |       3714985 |
| histogram_timer_server_handler_ns_99%           |       3919450 |
| histogram_timer_server_handler_ns_99.9%         |       4045965 |
| histogram_timer_server_handler_ns_99.99%        |       4045965 |
| histogram_timer_server_handler_ns_count         |           420 |
| histogram_timer_server_handler_ns_max           |       4045965 |
| histogram_timer_server_handler_ns_mean          |       2633835 |
| histogram_timer_server_handler_ns_median        |       2652558 |
| histogram_timer_server_handler_ns_min           |       1802664 |
| histogram_timer_server_handler_ns_stddev        |        589672 |
+-------------------------------------------------+---------------+
+-------------------------------------------------+---------------+
| samples                                         |           415 |
| ns/op                                           |       2939970 |
| B/op                                            |       1654405 |
| allocs/op                                       |         36664 |
| histogram_counter_server_query_cache_hit_75%    |          1.00 |
| histogram_counter_server_query_cache_hit_90%    |          1.00 |
| histogram_counter_server_query_cache_hit_95%    |          1.00 |
| histogram_counter_server_query_cache_hit_99%    |          1.00 |
| histogram_counter_server_query_cache_hit_99.9%  |          1.00 |
| histogram_counter_server_query_cache_hit_99.99% |          1.00 |
| histogram_counter_server_query_cache_hit_count  |           415 |
| histogram_counter_server_query_cache_hit_max    |          1.00 |
| histogram_counter_server_query_cache_hit_mean   |          1.00 |
| histogram_counter_server_query_cache_hit_median |          1.00 |
| histogram_counter_server_query_cache_hit_min    |          1.00 |
| histogram_counter_server_query_cache_hit_stddev |             0 |
| histogram_timer_rego_external_resolve_ns_75%    |           623 |
| histogram_timer_rego_external_resolve_ns_90%    |           657 |
| histogram_timer_rego_external_resolve_ns_95%    |           683 |
| histogram_timer_rego_external_resolve_ns_99%    |           756 |
| histogram_timer_rego_external_resolve_ns_99.9%  |           855 |
| histogram_timer_rego_external_resolve_ns_99.99% |           855 |
| histogram_timer_rego_external_resolve_ns_count  |           415 |
| histogram_timer_rego_external_resolve_ns_max    |           855 |
| histogram_timer_rego_external_resolve_ns_mean   |           475 |
| histogram_timer_rego_external_resolve_ns_median |           539 |
| histogram_timer_rego_external_resolve_ns_min    |           225 |
| histogram_timer_rego_external_resolve_ns_stddev |           161 |
| histogram_timer_rego_input_parse_ns_75%         |         19963 |
| histogram_timer_rego_input_parse_ns_90%         |         21491 |
| histogram_timer_rego_input_parse_ns_95%         |         22854 |
| histogram_timer_rego_input_parse_ns_99%         |         50806 |
| histogram_timer_rego_input_parse_ns_99.9%       |        197328 |
| histogram_timer_rego_input_parse_ns_99.99%      |        197328 |
| histogram_timer_rego_input_parse_ns_count       |           415 |
| histogram_timer_rego_input_parse_ns_max         |        197328 |
| histogram_timer_rego_input_parse_ns_mean        |         18482 |
| histogram_timer_rego_input_parse_ns_median      |         16931 |
| histogram_timer_rego_input_parse_ns_min         |          9962 |
| histogram_timer_rego_input_parse_ns_stddev      |         11776 |
| histogram_timer_rego_query_eval_ns_75%          |       3094584 |
| histogram_timer_rego_query_eval_ns_90%          |       3430620 |
| histogram_timer_rego_query_eval_ns_95%          |       3658790 |
| histogram_timer_rego_query_eval_ns_99%          |       3824311 |
| histogram_timer_rego_query_eval_ns_99.9%        |       3965069 |
| histogram_timer_rego_query_eval_ns_99.99%       |       3965069 |
| histogram_timer_rego_query_eval_ns_count        |           415 |
| histogram_timer_rego_query_eval_ns_max          |       3965069 |
| histogram_timer_rego_query_eval_ns_mean         |       2645366 |
| histogram_timer_rego_query_eval_ns_median       |       2650175 |
| histogram_timer_rego_query_eval_ns_min          |       1782037 |
| histogram_timer_rego_query_eval_ns_stddev       |        580150 |
| histogram_timer_server_handler_ns_75%           |       3125294 |
| histogram_timer_server_handler_ns_90%           |       3471657 |
| histogram_timer_server_handler_ns_95%           |       3697771 |
| histogram_timer_server_handler_ns_99%           |       3857537 |
| histogram_timer_server_handler_ns_99.9%         |       4008647 |
| histogram_timer_server_handler_ns_99.99%        |       4008647 |
| histogram_timer_server_handler_ns_count         |           415 |
| histogram_timer_server_handler_ns_max           |       4008647 |
| histogram_timer_server_handler_ns_mean          |       2676551 |
| histogram_timer_server_handler_ns_median        |       2676505 |
| histogram_timer_server_handler_ns_min           |       1801247 |
| histogram_timer_server_handler_ns_stddev        |        582735 |
+-------------------------------------------------+---------------+
+-------------------------------------------------+---------------+
| samples                                         |           429 |
| ns/op                                           |       2875957 |
| B/op                                            |       1654103 |
| allocs/op                                       |         36646 |
| histogram_counter_server_query_cache_hit_75%    |          1.00 |
| histogram_counter_server_query_cache_hit_90%    |          1.00 |
| histogram_counter_server_query_cache_hit_95%    |          1.00 |
| histogram_counter_server_query_cache_hit_99%    |          1.00 |
| histogram_counter_server_query_cache_hit_99.9%  |          1.00 |
| histogram_counter_server_query_cache_hit_99.99% |          1.00 |
| histogram_counter_server_query_cache_hit_count  |           429 |
| histogram_counter_server_query_cache_hit_max    |          1.00 |
| histogram_counter_server_query_cache_hit_mean   |          1.00 |
| histogram_counter_server_query_cache_hit_median |          1.00 |
| histogram_counter_server_query_cache_hit_min    |          1.00 |
| histogram_counter_server_query_cache_hit_stddev |             0 |
| histogram_timer_rego_external_resolve_ns_75%    |           620 |
| histogram_timer_rego_external_resolve_ns_90%    |           664 |
| histogram_timer_rego_external_resolve_ns_95%    |           702 |
| histogram_timer_rego_external_resolve_ns_99%    |           772 |
| histogram_timer_rego_external_resolve_ns_99.9%  |          1560 |
| histogram_timer_rego_external_resolve_ns_99.99% |          1560 |
| histogram_timer_rego_external_resolve_ns_count  |           429 |
| histogram_timer_rego_external_resolve_ns_max    |          1560 |
| histogram_timer_rego_external_resolve_ns_mean   |           478 |
| histogram_timer_rego_external_resolve_ns_median |           517 |
| histogram_timer_rego_external_resolve_ns_min    |           195 |
| histogram_timer_rego_external_resolve_ns_stddev |           170 |
| histogram_timer_rego_input_parse_ns_75%         |         20293 |
| histogram_timer_rego_input_parse_ns_90%         |         21773 |
| histogram_timer_rego_input_parse_ns_95%         |         24100 |
| histogram_timer_rego_input_parse_ns_99%         |         38639 |
| histogram_timer_rego_input_parse_ns_99.9%       |        121910 |
| histogram_timer_rego_input_parse_ns_99.99%      |        121910 |
| histogram_timer_rego_input_parse_ns_count       |           429 |
| histogram_timer_rego_input_parse_ns_max         |        121910 |
| histogram_timer_rego_input_parse_ns_mean        |         18297 |
| histogram_timer_rego_input_parse_ns_median      |         17081 |
| histogram_timer_rego_input_parse_ns_min         |         10089 |
| histogram_timer_rego_input_parse_ns_stddev      |          7347 |
| histogram_timer_rego_query_eval_ns_75%          |       3066166 |
| histogram_timer_rego_query_eval_ns_90%          |       3348785 |
| histogram_timer_rego_query_eval_ns_95%          |       3563977 |
| histogram_timer_rego_query_eval_ns_99%          |       3819867 |
| histogram_timer_rego_query_eval_ns_99.9%        |       3922384 |
| histogram_timer_rego_query_eval_ns_99.99%       |       3922384 |
| histogram_timer_rego_query_eval_ns_count        |           429 |
| histogram_timer_rego_query_eval_ns_max          |       3922384 |
| histogram_timer_rego_query_eval_ns_mean         |       2583534 |
| histogram_timer_rego_query_eval_ns_median       |       2603474 |
| histogram_timer_rego_query_eval_ns_min          |       1808112 |
| histogram_timer_rego_query_eval_ns_stddev       |        575804 |
| histogram_timer_server_handler_ns_75%           |       3100334 |
| histogram_timer_server_handler_ns_90%           |       3383843 |
| histogram_timer_server_handler_ns_95%           |       3596007 |
| histogram_timer_server_handler_ns_99%           |       3852433 |
| histogram_timer_server_handler_ns_99.9%         |       3949917 |
| histogram_timer_server_handler_ns_99.99%        |       3949917 |
| histogram_timer_server_handler_ns_count         |           429 |
| histogram_timer_server_handler_ns_max           |       3949917 |
| histogram_timer_server_handler_ns_mean          |       2615355 |
| histogram_timer_server_handler_ns_median        |       2630804 |
| histogram_timer_server_handler_ns_min           |       1833029 |
| histogram_timer_server_handler_ns_stddev        |        578109 |
+-------------------------------------------------+---------------+
+-------------------------------------------------+---------------+
| samples                                         |           410 |
| ns/op                                           |       2882429 |
| B/op                                            |       1655195 |
| allocs/op                                       |         36661 |
| histogram_counter_server_query_cache_hit_75%    |          1.00 |
| histogram_counter_server_query_cache_hit_90%    |          1.00 |
| histogram_counter_server_query_cache_hit_95%    |          1.00 |
| histogram_counter_server_query_cache_hit_99%    |          1.00 |
| histogram_counter_server_query_cache_hit_99.9%  |          1.00 |
| histogram_counter_server_query_cache_hit_99.99% |          1.00 |
| histogram_counter_server_query_cache_hit_count  |           410 |
| histogram_counter_server_query_cache_hit_max    |          1.00 |
| histogram_counter_server_query_cache_hit_mean   |          1.00 |
| histogram_counter_server_query_cache_hit_median |          1.00 |
| histogram_counter_server_query_cache_hit_min    |          1.00 |
| histogram_counter_server_query_cache_hit_stddev |             0 |
| histogram_timer_rego_external_resolve_ns_75%    |           621 |
| histogram_timer_rego_external_resolve_ns_90%    |           670 |
| histogram_timer_rego_external_resolve_ns_95%    |           722 |
| histogram_timer_rego_external_resolve_ns_99%    |           885 |
| histogram_timer_rego_external_resolve_ns_99.9%  |          1421 |
| histogram_timer_rego_external_resolve_ns_99.99% |          1421 |
| histogram_timer_rego_external_resolve_ns_count  |           410 |
| histogram_timer_rego_external_resolve_ns_max    |          1421 |
| histogram_timer_rego_external_resolve_ns_mean   |           473 |
| histogram_timer_rego_external_resolve_ns_median |           440 |
| histogram_timer_rego_external_resolve_ns_min    |           233 |
| histogram_timer_rego_external_resolve_ns_stddev |           175 |
| histogram_timer_rego_input_parse_ns_75%         |         19913 |
| histogram_timer_rego_input_parse_ns_90%         |         21532 |
| histogram_timer_rego_input_parse_ns_95%         |         22720 |
| histogram_timer_rego_input_parse_ns_99%         |         31491 |
| histogram_timer_rego_input_parse_ns_99.9%       |         99612 |
| histogram_timer_rego_input_parse_ns_99.99%      |         99612 |
| histogram_timer_rego_input_parse_ns_count       |           410 |
| histogram_timer_rego_input_parse_ns_max         |         99612 |
| histogram_timer_rego_input_parse_ns_mean        |         17843 |
| histogram_timer_rego_input_parse_ns_median      |         16975 |
| histogram_timer_rego_input_parse_ns_min         |         10947 |
| histogram_timer_rego_input_parse_ns_stddev      |          5298 |
| histogram_timer_rego_query_eval_ns_75%          |       3065998 |
| histogram_timer_rego_query_eval_ns_90%          |       3292961 |
| histogram_timer_rego_query_eval_ns_95%          |       3616937 |
| histogram_timer_rego_query_eval_ns_99%          |       3875452 |
| histogram_timer_rego_query_eval_ns_99.9%        |       4042379 |
| histogram_timer_rego_query_eval_ns_99.99%       |       4042379 |
| histogram_timer_rego_query_eval_ns_count        |           410 |
| histogram_timer_rego_query_eval_ns_max          |       4042379 |
| histogram_timer_rego_query_eval_ns_mean         |       2579309 |
| histogram_timer_rego_query_eval_ns_median       |       2589479 |
| histogram_timer_rego_query_eval_ns_min          |       1799927 |
| histogram_timer_rego_query_eval_ns_stddev       |        573780 |
| histogram_timer_server_handler_ns_75%           |       3099300 |
| histogram_timer_server_handler_ns_90%           |       3348681 |
| histogram_timer_server_handler_ns_95%           |       3653414 |
| histogram_timer_server_handler_ns_99%           |       3913699 |
| histogram_timer_server_handler_ns_99.9%         |       4090481 |
| histogram_timer_server_handler_ns_99.99%        |       4090481 |
| histogram_timer_server_handler_ns_count         |           410 |
| histogram_timer_server_handler_ns_max           |       4090481 |
| histogram_timer_server_handler_ns_mean          |       2610099 |
| histogram_timer_server_handler_ns_median        |       2619667 |
| histogram_timer_server_handler_ns_min           |       1825626 |
| histogram_timer_server_handler_ns_stddev        |        576384 |
+-------------------------------------------------+---------------+

Modified Version:

+-------------------------------------------------+---------------+
| samples                                         |           594 |
| ns/op                                           |       2018369 |
| B/op                                            |       1120979 |
| allocs/op                                       |         26196 |
| histogram_counter_server_query_cache_hit_75%    |          1.00 |
| histogram_counter_server_query_cache_hit_90%    |          1.00 |
| histogram_counter_server_query_cache_hit_95%    |          1.00 |
| histogram_counter_server_query_cache_hit_99%    |          1.00 |
| histogram_counter_server_query_cache_hit_99.9%  |          1.00 |
| histogram_counter_server_query_cache_hit_99.99% |          1.00 |
| histogram_counter_server_query_cache_hit_count  |           594 |
| histogram_counter_server_query_cache_hit_max    |          1.00 |
| histogram_counter_server_query_cache_hit_mean   |          1.00 |
| histogram_counter_server_query_cache_hit_median |          1.00 |
| histogram_counter_server_query_cache_hit_min    |          1.00 |
| histogram_counter_server_query_cache_hit_stddev |             0 |
| histogram_timer_rego_external_resolve_ns_75%    |           594 |
| histogram_timer_rego_external_resolve_ns_90%    |           644 |
| histogram_timer_rego_external_resolve_ns_95%    |           694 |
| histogram_timer_rego_external_resolve_ns_99%    |           796 |
| histogram_timer_rego_external_resolve_ns_99.9%  |          3050 |
| histogram_timer_rego_external_resolve_ns_99.99% |          3050 |
| histogram_timer_rego_external_resolve_ns_count  |           594 |
| histogram_timer_rego_external_resolve_ns_max    |          3050 |
| histogram_timer_rego_external_resolve_ns_mean   |           459 |
| histogram_timer_rego_external_resolve_ns_median |           495 |
| histogram_timer_rego_external_resolve_ns_min    |           196 |
| histogram_timer_rego_external_resolve_ns_stddev |           194 |
| histogram_timer_rego_input_parse_ns_75%         |         20329 |
| histogram_timer_rego_input_parse_ns_90%         |         21812 |
| histogram_timer_rego_input_parse_ns_95%         |         23388 |
| histogram_timer_rego_input_parse_ns_99%         |         32407 |
| histogram_timer_rego_input_parse_ns_99.9%       |        145243 |
| histogram_timer_rego_input_parse_ns_99.99%      |        145243 |
| histogram_timer_rego_input_parse_ns_count       |           594 |
| histogram_timer_rego_input_parse_ns_max         |        145243 |
| histogram_timer_rego_input_parse_ns_mean        |         18119 |
| histogram_timer_rego_input_parse_ns_median      |         17562 |
| histogram_timer_rego_input_parse_ns_min         |          9999 |
| histogram_timer_rego_input_parse_ns_stddev      |          6465 |
| histogram_timer_rego_query_eval_ns_75%          |       2023361 |
| histogram_timer_rego_query_eval_ns_90%          |       2255516 |
| histogram_timer_rego_query_eval_ns_95%          |       2549873 |
| histogram_timer_rego_query_eval_ns_99%          |       2736764 |
| histogram_timer_rego_query_eval_ns_99.9%        |       2960154 |
| histogram_timer_rego_query_eval_ns_99.99%       |       2960154 |
| histogram_timer_rego_query_eval_ns_count        |           594 |
| histogram_timer_rego_query_eval_ns_max          |       2960154 |
| histogram_timer_rego_query_eval_ns_mean         |       1741311 |
| histogram_timer_rego_query_eval_ns_median       |       1885378 |
| histogram_timer_rego_query_eval_ns_min          |       1172187 |
| histogram_timer_rego_query_eval_ns_stddev       |        433651 |
| histogram_timer_server_handler_ns_75%           |       2057270 |
| histogram_timer_server_handler_ns_90%           |       2289156 |
| histogram_timer_server_handler_ns_95%           |       2576288 |
| histogram_timer_server_handler_ns_99%           |       2779839 |
| histogram_timer_server_handler_ns_99.9%         |       2990860 |
| histogram_timer_server_handler_ns_99.99%        |       2990860 |
| histogram_timer_server_handler_ns_count         |           594 |
| histogram_timer_server_handler_ns_max           |       2990860 |
| histogram_timer_server_handler_ns_mean          |       1772765 |
| histogram_timer_server_handler_ns_median        |       1912682 |
| histogram_timer_server_handler_ns_min           |       1191010 |
| histogram_timer_server_handler_ns_stddev        |        436361 |
+-------------------------------------------------+---------------+
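As a rough cross-check of the 25-50% figure, the relative improvement can be computed from the first v0.67.0 table and the first modified-version table above. This is only a sketch: the helper name `reductionPct` is hypothetical, and it compares a single pair of runs rather than all of them.

```go
package main

import "fmt"

// reductionPct returns how much smaller `after` is than `before`, in percent.
func reductionPct(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// Values copied from the first v0.67.0 run and the first modified run.
	fmt.Printf("ns/op: %.1f%%\n", reductionPct(2863306, 2018369))
	fmt.Printf("server_handler p99.99: %.1f%%\n", reductionPct(4074388, 2990860))
	fmt.Printf("allocs/op: %.1f%%\n", reductionPct(36653, 26196))
}
```

For this pair of runs, all three reductions come out between roughly 26% and 30%, which is consistent with the lower end of the range quoted earlier.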
+-------------------------------------------------+---------------+
| samples                                         |           590 |
| ns/op                                           |       1999042 |
| B/op                                            |       1121766 |
| allocs/op                                       |         26196 |
| histogram_counter_server_query_cache_hit_75%    |          1.00 |
| histogram_counter_server_query_cache_hit_90%    |          1.00 |
| histogram_counter_server_query_cache_hit_95%    |          1.00 |
| histogram_counter_server_query_cache_hit_99%    |          1.00 |
| histogram_counter_server_query_cache_hit_99.9%  |          1.00 |
| histogram_counter_server_query_cache_hit_99.99% |          1.00 |
| histogram_counter_server_query_cache_hit_count  |           590 |
| histogram_counter_server_query_cache_hit_max    |          1.00 |
| histogram_counter_server_query_cache_hit_mean   |          1.00 |
| histogram_counter_server_query_cache_hit_median |          1.00 |
| histogram_counter_server_query_cache_hit_min    |          1.00 |
| histogram_counter_server_query_cache_hit_stddev |             0 |
| histogram_timer_rego_external_resolve_ns_75%    |           595 |
| histogram_timer_rego_external_resolve_ns_90%    |           652 |
| histogram_timer_rego_external_resolve_ns_95%    |           698 |
| histogram_timer_rego_external_resolve_ns_99%    |           808 |
| histogram_timer_rego_external_resolve_ns_99.9%  |          2767 |
| histogram_timer_rego_external_resolve_ns_99.99% |          2767 |
| histogram_timer_rego_external_resolve_ns_count  |           590 |
| histogram_timer_rego_external_resolve_ns_max    |          2767 |
| histogram_timer_rego_external_resolve_ns_mean   |           464 |
| histogram_timer_rego_external_resolve_ns_median |           493 |
| histogram_timer_rego_external_resolve_ns_min    |           172 |
| histogram_timer_rego_external_resolve_ns_stddev |           211 |
| histogram_timer_rego_input_parse_ns_75%         |         20162 |
| histogram_timer_rego_input_parse_ns_90%         |         22089 |
| histogram_timer_rego_input_parse_ns_95%         |         23815 |
| histogram_timer_rego_input_parse_ns_99%         |         30522 |
| histogram_timer_rego_input_parse_ns_99.9%       |        145420 |
| histogram_timer_rego_input_parse_ns_99.99%      |        145420 |
| histogram_timer_rego_input_parse_ns_count       |           590 |
| histogram_timer_rego_input_parse_ns_max         |        145420 |
| histogram_timer_rego_input_parse_ns_mean        |         18216 |
| histogram_timer_rego_input_parse_ns_median      |         17216 |
| histogram_timer_rego_input_parse_ns_min         |         10377 |
| histogram_timer_rego_input_parse_ns_stddev      |          7632 |
| histogram_timer_rego_query_eval_ns_75%          |       2028536 |
| histogram_timer_rego_query_eval_ns_90%          |       2268910 |
| histogram_timer_rego_query_eval_ns_95%          |       2596479 |
| histogram_timer_rego_query_eval_ns_99%          |       2727354 |
| histogram_timer_rego_query_eval_ns_99.9%        |       2909117 |
| histogram_timer_rego_query_eval_ns_99.99%       |       2909117 |
| histogram_timer_rego_query_eval_ns_count        |           590 |
| histogram_timer_rego_query_eval_ns_max          |       2909117 |
| histogram_timer_rego_query_eval_ns_mean         |       1730977 |
| histogram_timer_rego_query_eval_ns_median       |       1769727 |
| histogram_timer_rego_query_eval_ns_min          |       1184764 |
| histogram_timer_rego_query_eval_ns_stddev       |        438296 |
| histogram_timer_server_handler_ns_75%           |       2065912 |
| histogram_timer_server_handler_ns_90%           |       2302219 |
| histogram_timer_server_handler_ns_95%           |       2623837 |
| histogram_timer_server_handler_ns_99%           |       2802402 |
| histogram_timer_server_handler_ns_99.9%         |       2942897 |
| histogram_timer_server_handler_ns_99.99%        |       2942897 |
| histogram_timer_server_handler_ns_count         |           590 |
| histogram_timer_server_handler_ns_max           |       2942897 |
| histogram_timer_server_handler_ns_mean          |       1763092 |
| histogram_timer_server_handler_ns_median        |       1797988 |
| histogram_timer_server_handler_ns_min           |       1205768 |
| histogram_timer_server_handler_ns_stddev        |        441766 |
+-------------------------------------------------+---------------+
+-------------------------------------------------+---------------+
| samples                                         |           595 |
| ns/op                                           |       2024177 |
| B/op                                            |       1122398 |
| allocs/op                                       |         26196 |
| histogram_counter_server_query_cache_hit_75%    |          1.00 |
| histogram_counter_server_query_cache_hit_90%    |          1.00 |
| histogram_counter_server_query_cache_hit_95%    |          1.00 |
| histogram_counter_server_query_cache_hit_99%    |          1.00 |
| histogram_counter_server_query_cache_hit_99.9%  |          1.00 |
| histogram_counter_server_query_cache_hit_99.99% |          1.00 |
| histogram_counter_server_query_cache_hit_count  |           595 |
| histogram_counter_server_query_cache_hit_max    |          1.00 |
| histogram_counter_server_query_cache_hit_mean   |          1.00 |
| histogram_counter_server_query_cache_hit_median |          1.00 |
| histogram_counter_server_query_cache_hit_min    |          1.00 |
| histogram_counter_server_query_cache_hit_stddev |             0 |
| histogram_timer_rego_external_resolve_ns_75%    |           599 |
| histogram_timer_rego_external_resolve_ns_90%    |           652 |
| histogram_timer_rego_external_resolve_ns_95%    |           706 |
| histogram_timer_rego_external_resolve_ns_99%    |           875 |
| histogram_timer_rego_external_resolve_ns_99.9%  |          1717 |
| histogram_timer_rego_external_resolve_ns_99.99% |          1717 |
| histogram_timer_rego_external_resolve_ns_count  |           595 |
| histogram_timer_rego_external_resolve_ns_max    |          1717 |
| histogram_timer_rego_external_resolve_ns_mean   |           469 |
| histogram_timer_rego_external_resolve_ns_median |           496 |
| histogram_timer_rego_external_resolve_ns_min    |           174 |
| histogram_timer_rego_external_resolve_ns_stddev |           174 |
| histogram_timer_rego_input_parse_ns_75%         |         20360 |
| histogram_timer_rego_input_parse_ns_90%         |         21883 |
| histogram_timer_rego_input_parse_ns_95%         |         23111 |
| histogram_timer_rego_input_parse_ns_99%         |         29452 |
| histogram_timer_rego_input_parse_ns_99.9%       |         64900 |
| histogram_timer_rego_input_parse_ns_99.99%      |         64900 |
| histogram_timer_rego_input_parse_ns_count       |           595 |
| histogram_timer_rego_input_parse_ns_max         |         64900 |
| histogram_timer_rego_input_parse_ns_mean        |         17956 |
| histogram_timer_rego_input_parse_ns_median      |         17385 |
| histogram_timer_rego_input_parse_ns_min         |          9965 |
| histogram_timer_rego_input_parse_ns_stddev      |          4338 |
| histogram_timer_rego_query_eval_ns_75%          |       2030123 |
| histogram_timer_rego_query_eval_ns_90%          |       2265249 |
| histogram_timer_rego_query_eval_ns_95%          |       2538084 |
| histogram_timer_rego_query_eval_ns_99%          |       2768032 |
| histogram_timer_rego_query_eval_ns_99.9%        |       2970319 |
| histogram_timer_rego_query_eval_ns_99.99%       |       2970319 |
| histogram_timer_rego_query_eval_ns_count        |           595 |
| histogram_timer_rego_query_eval_ns_max          |       2970319 |
| histogram_timer_rego_query_eval_ns_mean         |       1745604 |
| histogram_timer_rego_query_eval_ns_median       |       1866227 |
| histogram_timer_rego_query_eval_ns_min          |       1171194 |
| histogram_timer_rego_query_eval_ns_stddev       |        430410 |
| histogram_timer_server_handler_ns_75%           |       2066952 |
| histogram_timer_server_handler_ns_90%           |       2300031 |
| histogram_timer_server_handler_ns_95%           |       2566799 |
| histogram_timer_server_handler_ns_99%           |       2807388 |
| histogram_timer_server_handler_ns_99.9%         |       3007238 |
| histogram_timer_server_handler_ns_99.99%        |       3007238 |
| histogram_timer_server_handler_ns_count         |           595 |
| histogram_timer_server_handler_ns_max           |       3007238 |
| histogram_timer_server_handler_ns_mean          |       1777134 |
| histogram_timer_server_handler_ns_median        |       1894226 |
| histogram_timer_server_handler_ns_min           |       1191160 |
| histogram_timer_server_handler_ns_stddev        |        432844 |
+-------------------------------------------------+---------------+
+-------------------------------------------------+---------------+
| samples                                         |           572 |
| ns/op                                           |       1996541 |
| B/op                                            |       1119163 |
| allocs/op                                       |         26195 |
| histogram_counter_server_query_cache_hit_75%    |          1.00 |
| histogram_counter_server_query_cache_hit_90%    |          1.00 |
| histogram_counter_server_query_cache_hit_95%    |          1.00 |
| histogram_counter_server_query_cache_hit_99%    |          1.00 |
| histogram_counter_server_query_cache_hit_99.9%  |          1.00 |
| histogram_counter_server_query_cache_hit_99.99% |          1.00 |
| histogram_counter_server_query_cache_hit_count  |           572 |
| histogram_counter_server_query_cache_hit_max    |          1.00 |
| histogram_counter_server_query_cache_hit_mean   |          1.00 |
| histogram_counter_server_query_cache_hit_median |          1.00 |
| histogram_counter_server_query_cache_hit_min    |          1.00 |
| histogram_counter_server_query_cache_hit_stddev |             0 |
| histogram_timer_rego_external_resolve_ns_75%    |           598 |
| histogram_timer_rego_external_resolve_ns_90%    |           646 |
| histogram_timer_rego_external_resolve_ns_95%    |           690 |
| histogram_timer_rego_external_resolve_ns_99%    |           869 |
| histogram_timer_rego_external_resolve_ns_99.9%  |          6022 |
| histogram_timer_rego_external_resolve_ns_99.99% |          6022 |
| histogram_timer_rego_external_resolve_ns_count  |           572 |
| histogram_timer_rego_external_resolve_ns_max    |          6022 |
| histogram_timer_rego_external_resolve_ns_mean   |           473 |
| histogram_timer_rego_external_resolve_ns_median |           479 |
| histogram_timer_rego_external_resolve_ns_min    |           203 |
| histogram_timer_rego_external_resolve_ns_stddev |           316 |
| histogram_timer_rego_input_parse_ns_75%         |         20156 |
| histogram_timer_rego_input_parse_ns_90%         |         22035 |
| histogram_timer_rego_input_parse_ns_95%         |         23648 |
| histogram_timer_rego_input_parse_ns_99%         |         33235 |
| histogram_timer_rego_input_parse_ns_99.9%       |         55003 |
| histogram_timer_rego_input_parse_ns_99.99%      |         55003 |
| histogram_timer_rego_input_parse_ns_count       |           572 |
| histogram_timer_rego_input_parse_ns_max         |         55003 |
| histogram_timer_rego_input_parse_ns_mean        |         17927 |
| histogram_timer_rego_input_parse_ns_median      |         17156 |
| histogram_timer_rego_input_parse_ns_min         |         10005 |
| histogram_timer_rego_input_parse_ns_stddev      |          4419 |
| histogram_timer_rego_query_eval_ns_75%          |       2029172 |
| histogram_timer_rego_query_eval_ns_90%          |       2295058 |
| histogram_timer_rego_query_eval_ns_95%          |       2569377 |
| histogram_timer_rego_query_eval_ns_99%          |       2794592 |
| histogram_timer_rego_query_eval_ns_99.9%        |       2896681 |
| histogram_timer_rego_query_eval_ns_99.99%       |       2896681 |
| histogram_timer_rego_query_eval_ns_count        |           572 |
| histogram_timer_rego_query_eval_ns_max          |       2896681 |
| histogram_timer_rego_query_eval_ns_mean         |       1722685 |
| histogram_timer_rego_query_eval_ns_median       |       1712026 |
| histogram_timer_rego_query_eval_ns_min          |       1181217 |
| histogram_timer_rego_query_eval_ns_stddev       |        446376 |
| histogram_timer_server_handler_ns_75%           |       2064290 |
| histogram_timer_server_handler_ns_90%           |       2335003 |
| histogram_timer_server_handler_ns_95%           |       2602541 |
| histogram_timer_server_handler_ns_99%           |       2830435 |
| histogram_timer_server_handler_ns_99.9%         |       2934639 |
| histogram_timer_server_handler_ns_99.99%        |       2934639 |
| histogram_timer_server_handler_ns_count         |           572 |
| histogram_timer_server_handler_ns_max           |       2934639 |
| histogram_timer_server_handler_ns_mean          |       1753799 |
| histogram_timer_server_handler_ns_median        |       1753756 |
| histogram_timer_server_handler_ns_min           |       1202059 |
| histogram_timer_server_handler_ns_stddev        |        448541 |
+-------------------------------------------------+---------------+
+-------------------------------------------------+---------------+
| samples                                         |           588 |
| ns/op                                           |       2004430 |
| B/op                                            |       1120667 |
| allocs/op                                       |         26195 |
| histogram_counter_server_query_cache_hit_75%    |          1.00 |
| histogram_counter_server_query_cache_hit_90%    |          1.00 |
| histogram_counter_server_query_cache_hit_95%    |          1.00 |
| histogram_counter_server_query_cache_hit_99%    |          1.00 |
| histogram_counter_server_query_cache_hit_99.9%  |          1.00 |
| histogram_counter_server_query_cache_hit_99.99% |          1.00 |
| histogram_counter_server_query_cache_hit_count  |           588 |
| histogram_counter_server_query_cache_hit_max    |          1.00 |
| histogram_counter_server_query_cache_hit_mean   |          1.00 |
| histogram_counter_server_query_cache_hit_median |          1.00 |
| histogram_counter_server_query_cache_hit_min    |          1.00 |
| histogram_counter_server_query_cache_hit_stddev |             0 |
| histogram_timer_rego_external_resolve_ns_75%    |           594 |
| histogram_timer_rego_external_resolve_ns_90%    |           643 |
| histogram_timer_rego_external_resolve_ns_95%    |           682 |
| histogram_timer_rego_external_resolve_ns_99%    |           779 |
| histogram_timer_rego_external_resolve_ns_99.9%  |           906 |
| histogram_timer_rego_external_resolve_ns_99.99% |           906 |
| histogram_timer_rego_external_resolve_ns_count  |           588 |
| histogram_timer_rego_external_resolve_ns_max    |           906 |
| histogram_timer_rego_external_resolve_ns_mean   |           452 |
| histogram_timer_rego_external_resolve_ns_median |           465 |
| histogram_timer_rego_external_resolve_ns_min    |           171 |
| histogram_timer_rego_external_resolve_ns_stddev |           162 |
| histogram_timer_rego_input_parse_ns_75%         |         20201 |
| histogram_timer_rego_input_parse_ns_90%         |         21756 |
| histogram_timer_rego_input_parse_ns_95%         |         23069 |
| histogram_timer_rego_input_parse_ns_99%         |         42176 |
| histogram_timer_rego_input_parse_ns_99.9%       |        126950 |
| histogram_timer_rego_input_parse_ns_99.99%      |        126950 |
| histogram_timer_rego_input_parse_ns_count       |           588 |
| histogram_timer_rego_input_parse_ns_max         |        126950 |
| histogram_timer_rego_input_parse_ns_mean        |         18279 |
| histogram_timer_rego_input_parse_ns_median      |         17220 |
| histogram_timer_rego_input_parse_ns_min         |         10045 |
| histogram_timer_rego_input_parse_ns_stddev      |          8113 |
| histogram_timer_rego_query_eval_ns_75%          |       2040807 |
| histogram_timer_rego_query_eval_ns_90%          |       2234770 |
| histogram_timer_rego_query_eval_ns_95%          |       2498280 |
| histogram_timer_rego_query_eval_ns_99%          |       2743252 |
| histogram_timer_rego_query_eval_ns_99.9%        |       2925037 |
| histogram_timer_rego_query_eval_ns_99.99%       |       2925037 |
| histogram_timer_rego_query_eval_ns_count        |           588 |
| histogram_timer_rego_query_eval_ns_max          |       2925037 |
| histogram_timer_rego_query_eval_ns_mean         |       1729973 |
| histogram_timer_rego_query_eval_ns_median       |       1779205 |
| histogram_timer_rego_query_eval_ns_min          |       1170691 |
| histogram_timer_rego_query_eval_ns_stddev       |        439014 |
| histogram_timer_server_handler_ns_75%           |       2076536 |
| histogram_timer_server_handler_ns_90%           |       2292913 |
| histogram_timer_server_handler_ns_95%           |       2550020 |
| histogram_timer_server_handler_ns_99%           |       2776740 |
| histogram_timer_server_handler_ns_99.9%         |       2961330 |
| histogram_timer_server_handler_ns_99.99%        |       2961330 |
| histogram_timer_server_handler_ns_count         |           588 |
| histogram_timer_server_handler_ns_max           |       2961330 |
| histogram_timer_server_handler_ns_mean          |       1761249 |
| histogram_timer_server_handler_ns_median        |       1812784 |
| histogram_timer_server_handler_ns_min           |       1190756 |
| histogram_timer_server_handler_ns_stddev        |        441948 |
+-------------------------------------------------+---------------+
anderseknert commented 1 month ago

Thanks for that detailed report! A contribution would be most welcome 😃

ashutosh-narkar commented 1 month ago

> I'd be open to work on a PR and a solution I actually have done one on opa-envoy-plugin and would like to contribute to the community:)

Awesome!

Making the cache size configurable may not be ideal. You could do it via the context, but that does not seem correct.

@johanfylling may know the reason for setting the default to 100 but we could just increase that.

Currently we remove only one value from the cache; maybe that can be improved.

amirsalarsafaei commented 1 month ago

@ashutosh-narkar setting the default to any fixed number may not be ideal, because the default rule or bundle static data may be beyond that, and that can cause the same issue.

ashutosh-narkar commented 1 month ago

> setting the default to any number may not be ideal, because the default rule or bundle static data may be beyond that, and that can cause the same issue

Sure. What we probably need to understand is what the norm is here. Whatever value is selected, there are always going to be some cases where we see performance issues. The question then is whether that's an edge case, which is not easy to figure out. Since we cannot make the value configurable, looking into improving caching behavior would be a possible solution. WDYT @johanfylling?

johanfylling commented 1 month ago

Setting the cache size to 100 was made with the assumption that a large set of regex patterns is very uncommon, and that slow performance is a better tradeoff than OOM.

In the original OOM fix, we opted to not use a more sophisticated eviction strategy than to simply remove a random entry from the cache. An option could be to revisit this, and instead evict maybe the least used entry in the cache (just as an example). Anything complicated here that requires us to scan or reorder the cache would of course come with its own performance penalty. But from your issue description, it sounds like the number of patterns might still be too large for this to be a viable solution.

An argument for making the cache size configurable through config-file/env-var would be that whoever is launching OPA is likely also the person/team to know the memory limitations of the host machine.

amirsalarsafaei commented 1 month ago

What about re-using the caching package? It has a configurable cache size limit in bytes, which is better than an entry counter.

amirsalarsafaei commented 1 month ago

Hi again, I read a little about the glob library, and maybe we should change our view of the matter at hand. The library's README states that “This library is created for compile-once patterns”; if someone needs to match dynamic patterns (based on input or changing policy), they should use regex. Maybe we should warn users of the OOM risk and set a large number for the cap. To give you some perspective on why 100 is not enough for a glob, imagine having more than 1000 REST endpoints which may or may not contain some * as part of their path due to RESTful principles. Glob is the obvious choice over regex in this case. The endpoints rarely change without a bundle/policy change.

The other solution that I was thinking about was adding another argument to glob.match, called persist. It’s disabled by default, but if the user enables it, the compiled glob will always be persisted in a separate map no matter what, without a cap! This brings back the OOM problem, but with good naming and documentation we can mitigate it.

@johanfylling

johanfylling commented 1 month ago

Raising the glob cache size to somewhere in the thousands should be benign. What would be a suitable limit for your use case @amirsalarsafaei?

We still might want this to be configurable, though 🤔.

amirsalarsafaei commented 1 month ago

@johanfylling I actually took a shortcut by creating a template file and replacing glob.match with something like this:

input.parsed_path == ["v1", "users", input.parsed_path[2]]

input.parsed_path is the path split by /. However, having worked with glob.match makes me think that this change isn’t right. There should be a difference between regex and glob. Glob is slower at compilation, so a cap of 100 would hold it back. I think setting the cap to 1000 or 5000 would be a good start, but it wouldn’t address the issue completely.

johanfylling commented 3 weeks ago

@amirsalarsafaei, for paths that don't require mid-point ** globs, I think your solution looks very idiomatic for Regal :).

johanfylling commented 3 weeks ago

Raising the cache limit too far would bring us back to the memory issue previously solved. I think we should aim for a solution that accommodates most scenarios without being destructive. OOM being a halting issue, and cache overflow "just" being a performance issue, I'm inclined to favor the OOM issue.

I suggest we move the glob cache to the BuiltinContext, and add a configuration option to allow admins to set the cache size. This way, we can keep the default size as-is, but allow users to modify it (or even make it unbounded) if cache performance becomes a concern.

For extra credit, we could even add something like a glob_cache_usage/glob_cache_full metric and/or log messages for when the cache is full, to inform users that they've reached a performance-degrading situation that can be addressed through configuration.