Closed pruivo closed 3 weeks ago
Took a quick look inside the realms cache; it should require 4 entries per client:
<client-uuid>.optional.clientscopes -> ClientScopeListQuery
<client-uuid>.default.clientscopes -> ClientScopeListQuery
realm-0.client.query.by.clientId.<client-id> -> ClientListQuery
<client-uuid> -> CachedClient
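To make the four entries concrete, here is a minimal sketch that builds the keys for one client. The key formats are copied from the list above; the function name and the sample UUID and client ID are made up for illustration and are not Keycloak APIs.

```python
def client_cache_keys(realm_id: str, client_uuid: str, client_id: str) -> dict:
    """Map each realms-cache key used by one client to the entry type stored under it."""
    return {
        f"{client_uuid}.optional.clientscopes": "ClientScopeListQuery",
        f"{client_uuid}.default.clientscopes": "ClientScopeListQuery",
        f"{realm_id}.client.query.by.clientId.{client_id}": "ClientListQuery",
        client_uuid: "CachedClient",
    }


# Hypothetical sample values, for illustration only.
keys = client_cache_keys("realm-0", "11111111-2222-3333-4444-555555555555", "client-42")
for key, entry_type in keys.items():
    print(f"{key} -> {entry_type}")
```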
There are other values stored in the realms cache, as follows:
CachedClientRole
CachedClient
CachedGroup
CachedRealmRole
CachedCount
ClientScopeListQuery
ClientListQuery
RealmListQuery
I'm wondering why users got their cache but clients didn't 🤔
A single query could fetch both the default and optional scopes and store them in one ClientScopeListQuery. That would save not only 1 cache entry but 1 database access too.
Clients are only cached on the second access. When the cache entry realm-0.client.query.by.clientId.<client-id> -> ClientListQuery is created, the client itself is not cached. Why not!? It has already been loaded from the database and could be cached right away; as it stands, only the second access caches it.
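A toy model of the behavior described above, to show the extra database read. This is not Keycloak's code: the class, its fields, and the `cache_on_first_load` switch are all hypothetical, and the "database" is just a dict.

```python
class ToyClientCache:
    """Toy model: lookup by clientId goes through a query entry, then the client entry."""

    def __init__(self, database, cache_on_first_load):
        self.database = database  # client_id -> (client_uuid, client data)
        self.cache = {}
        self.db_reads = 0
        self.cache_on_first_load = cache_on_first_load

    def get_by_client_id(self, realm, client_id):
        query_key = f"{realm}.client.query.by.clientId.{client_id}"
        client_uuid = self.cache.get(query_key)
        if client_uuid is None:
            # Query entry missing: load from the database and cache the query result.
            self.db_reads += 1
            client_uuid, client = self.database[client_id]
            self.cache[query_key] = client_uuid
            if self.cache_on_first_load:
                # Proposed change: the client is already loaded, so cache it now.
                self.cache[client_uuid] = client
            return client
        client = self.cache.get(client_uuid)
        if client is None:
            # Described behavior: the client entry is only created on this second access.
            self.db_reads += 1
            _, client = self.database[client_id]
            self.cache[client_uuid] = client
        return client


def count_db_reads(cache_on_first_load, accesses=3):
    database = {"client-42": ("uuid-42", {"clientId": "client-42"})}
    cache = ToyClientCache(database, cache_on_first_load)
    for _ in range(accesses):
        cache.get_by_client_id("realm-0", "client-42")
    return cache.db_reads


print(count_db_reads(cache_on_first_load=False))
print(count_db_reads(cache_on_first_load=True))
```

In this model, caching on first load saves one database read per client, which adds up when many clients are each touched only a few times.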
The cache entry realm-0.client.query.by.clientId.<client-id> -> ClientListQuery could be removed if the client's UUID were generated from realm + client-id. That would break existing databases, so it can never be implemented 😢
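For illustration only, a sketch of what a UUID derived from realm + client-id could look like, using a name-based (version 5) UUID. The namespace value and the way the two parts are combined are assumptions, not anything Keycloak does today.

```python
import uuid

# Hypothetical namespace; any fixed UUID would work as long as it never changes.
CLIENT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "clients.example.invalid")


def derived_client_uuid(realm: str, client_id: str) -> uuid.UUID:
    """Deterministically derive a client UUID from the realm and the client-id."""
    # Length-prefix the realm so ("ab", "c") and ("a", "bc") cannot collide.
    name = f"{len(realm)}:{realm}{client_id}"
    return uuid.uuid5(CLIENT_NAMESPACE, name)


print(derived_client_uuid("realm-0", "client-42"))
```

With a deterministic UUID, a lookup by client-id could compute the cache key directly instead of going through the ClientListQuery entry.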
Summary: Using half of the clients brings the DB usage to ~65% and the 99th-percentile response time to ~500ms. The cache hit ratio is ~75%.
Command Line
./benchmark.sh "eu-west-1" --scenario="keycloak.scenario.authentication.ClientSecret" --server-url="***" --users-per-sec=1000 --measurement=600 --realm-name=realm-0 --logout-percentage=100 --users-per-realm=20000 --clients-per-realm=10000 --ramp-up=20 --log-http-on-failure --refresh-token-count=0 --refresh-token-period=0 --sla-error-percentage=0.001
Database Usage
Gatling Results
Keycloak Response Times
Caches Hit Ratio
@ahus1 @mhajas Do we want to reduce the RPS to reduce the load on the database below 50%? What is the target 99% response time that you have in mind?
Summary: DB usage dropped to ~50% and the 99th-percentile response time to 77ms. The cache hit ratio is ~75%.
Database Usage
Gatling Results
Keycloak Response Times
Thanks, these numbers look good. Still, one thing concerns me: there is an HTTP 401 status code in there, which means access was denied. This is unexpected. Can you please have a look? A 401 could mean the wrong password, or that the client doesn't exist, or something else. This might then lead to a very different load pattern. Thanks!
Cache usage: 2x in users cache, 3x in realms cache per client.
-> 10k clients
-> reduce RPS to a number where the DB is no longer overloaded (target: ~50% DB usage)
-> Write some docs on how to size users and client cache based on the number of clients
Follow-up task: update the sizing guide for RHBK26
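A rough sizing sketch based on the per-client figures in the notes above (2 entries in the users cache, 3 in the realms cache). It is only a lower bound under those assumptions: it ignores entries for actual users, realms, roles, groups, and so on, and the function name is made up.

```python
# Per-client entry counts taken from the notes above.
USERS_CACHE_ENTRIES_PER_CLIENT = 2
REALMS_CACHE_ENTRIES_PER_CLIENT = 3


def min_cache_entries(clients: int) -> dict:
    """Lower bound on cache entries needed to keep all clients cached."""
    return {
        "users": USERS_CACHE_ENTRIES_PER_CLIENT * clients,
        "realms": REALMS_CACHE_ENTRIES_PER_CLIENT * clients,
    }


print(min_cache_entries(10_000))  # {'users': 20000, 'realms': 30000}
```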