envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
24.75k stars 4.76k forks source link

rbac: principals evaluation performance #12285

Open sschepens opened 4 years ago

sschepens commented 4 years ago

We want to evaluate RBAC policies and we have 1.3k "principals", users we want to allow, extracted nowadays via jwt_authn metadata, and these users can vary form endpoints allowed. The actual performance of RBAC is not great for these use cases since it is linear to the amounts of principals. Current structure allows to define how to extract the value of the principal and only match it against a single matcher (StringMatch in our case).

I can see two possibilities for improvement:

The first alternative would be restrictive to principal matchers that internally use StringMatcher, but other matchers could be left out, and it would probably be nice to support set lookup in other matchers too.

sschepens commented 4 years ago

@mattklein123 i could probably have a go at a PR implementing the set matcher in SringMatcher, but this probably needs some discussion if this is the best approach.

mattklein123 commented 4 years ago

Thanks for opening. For reference, there is some related efforts here:

Optimally, we would come up with a common structure here around matching that allows for the tree with linear leaves structure proposed in https://github.com/envoyproxy/envoy/issues/6602.

This would allow us to hopefully use a similar API across routing, matching, etc. Is it possible to take a look at all of these issues and then come back with a proposal? It might also make sense to sync up with @yangminzhu about the unified matching proposal. Thank you!

yangminzhu commented 4 years ago

Thanks @sschepens for starting this.

There is a work in Istio started recently to do a large scale performance test of Istio AuthZ policy that uses the Envoy RBAC filter under the hood, the test is trying to get the performance data (e.g. latency, memory and etc.) of the RBAC filter with large configs (e.g. 10k path rules, principals and etc.). @MichaelEizaguirre is working on this from Istio side, hope we can share some data soon. This is a very good example that why such performance data would be very helpful for others.

The RBAC filter (especially the matcher part) currently is not optimized for performance just like you mentioned, for example, it only does liner search for strings within the OR/AND list matcher and the same applies to source IP matching which is another common use case that could benefit a lot from the performance optimization.

It might also make sense to sync up with @yangminzhu about the unified matching proposal.

@mattklein123 For the unified matching API, I have looked into the tap filter and I think the unified matching API could just be based on the existing tap filter matcher, the tap filter matcher has a very flexible interface and good support of streaming matching already. I don't see any significant blocking issues so far, will try to get the design updated with a PoC so that we can discuss further in next community meeting.

i could probably have a go at a PR implementing the set matcher in SringMatcher, but this probably needs some discussion if this is the best approach.

I think this is possible and I agree we need some investigations to see if there are better approaches. Depending on how to add such optimizations, it might be completely orthogonal and independent to the unified matching API if it's added inside the StringMatcher, or it may be in a different level that affects the unified matching API design, like a list of StringMatcher in the unified matching API.

mattklein123 commented 4 years ago

@mattklein123 For the unified matching API, I have looked into the tap filter and I think the unified matching API could just be based on the existing tap filter matcher, the tap filter matcher has a very flexible interface and good support of streaming matching already. I don't see any significant blocking issues so far, will try to get the design updated with a PoC so that we can discuss further in next community meeting.

Awesome!

I think this is possible and I agree we need some investigations to see if there are better approaches. Depending on how to add such optimizations, it might be completely orthogonal and independent to the unified matching API if it's added inside the StringMatcher, or it may be in a different level that affects the unified matching API design, like a list of StringMatcher in the unified matching API.

What I'm getting at here is that I think in general we want to move to a route/etc. matching system in which we have a tree where the lives can be linear matchers. So basically the only types of matchers that would work for the tree portion would be ones that can be expressed as a trie/hash/etc. with the existing linear matching system for routes, rbac, etc. being the leaves. This allows for keeping all existing functionality while still allowing for a high performance path as long as the matching rules are amenable to tree matching. It would be optimal to think about this holistically across Envoy vs. doing a one off here.

znqjnny commented 2 years ago

Hi @mattklein123 , regards the performance issue that discussed above. I wonder do you have a plan to optimize this ongoing right now or in the future?

mattklein123 commented 2 years ago

Hi @mattklein123 , regards the performance issue that discussed above. I wonder do you have a plan to optimize this ongoing right now or in the future?

I'm not sure of any current work in this area, but unified matching is now implemented and I don't think it would be too hard to do better here. cc @kyessenov who might know current plans.

kyessenov commented 2 years ago

Unified matchers have the built-in exact match map which I think solves the issue of hash-map lookup by a principal. I think we are missing some "DataInputs" (e.g. dynamic metadata from jwt_authn) but that's a work in progress and can be added incrementally.

znqjnny commented 2 years ago

Thanks so much for the information, @mattklein123 and @kyessenov.