keycloak / keycloak-benchmark

Keycloak Benchmark
https://www.keycloak.org/keycloak-benchmark/
Apache License 2.0

memoryUsageTest in ROSA Scaling Benchmark produces negative values for activeSessionsPer500MbPerPod metric #753

Closed: kami619 closed this 4 months ago

kami619 commented 4 months ago

Describe the bug

We are seeing negative values in the results generated by the ROSA Scaling Benchmark GitHub workflow.

Version

keycloak-benchmark latest main

Expected behavior

We should see predictable and accurate values for this metric.

Actual behavior

Runs with negative values for reference:

How to Reproduce?

No response

Anything else?

No response

kami619 commented 4 months ago

Variances were captured in Horreum; however, we need to prioritize setting up alerts based on these threshold breaches.

cc: @ahus1 @andyuk1986

[Screenshot 2024-03-28 at 13:10:36] [Screenshot 2024-03-28 at 13:10:26]

andyuk1986 commented 4 months ago

Thanks @kami619 , I will look into this.

andyuk1986 commented 4 months ago

Hey @ahus1, I was looking into Prometheus metrics that could be used as a substitute for container_memory_working_set_bytes, and collected a few that I think could work:

- container_memory_rss -> the amount of the container's memory that is actually held in physical RAM
- container_memory_usage_bytes -> the total memory usage of the container, including all memory regardless of its source

After some googling I found these approximate formulas:

- container_memory_working_set_bytes = container_memory_rss + total_active_file
- container_memory_usage_bytes = container_memory_rss + total_inactive_file + total_active_file

So I am not sure whether the metrics above will be correct, as container_memory_working_set_bytes is based on them.

There are two more:

node_memory_Active_bytes -> shows the amount of memory that is currently in use by active processes on the node (i.e. all pods). It seems to report data for all processes (keycloak, ispn, etc.); when I try to specify the namespace and container name, it returns empty.

jvm_memory_used_bytes -> represents the amount of memory currently being used by the JVM across all memory areas. Here it is also possible to specify which namespace and which container to show data for.

I think we can choose one of the last two, WDYT?
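As a side note, here is a minimal sketch of how such a scoped query could be issued against the Prometheus HTTP API. The Prometheus URL, the class name, and the namespace/container label values are placeholders, not taken from the actual benchmark setup:

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class JvmMemoryQuery {
    public static void main(String[] args) throws Exception {
        // Placeholder endpoint; in a real run this would be the cluster's Prometheus route.
        String prometheus = "http://localhost:9090";
        // Sum JVM memory usage across all memory areas, scoped to the Keycloak container.
        String promql = "sum(jvm_memory_used_bytes{namespace=\"keycloak\", container=\"keycloak\"})";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(prometheus + "/api/v1/query?query="
                        + URLEncoder.encode(promql, StandardCharsets.UTF_8)))
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON with the instant-query result
    }
}
```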

ahus1 commented 4 months ago

@andyuk1986 - thank you for looking into these metrics.

AFAIK node_memory_Active_bytes is not a fit here, as its results are diluted by the other pods running on the node where Keycloak runs, and we won't be able to distinguish the increased memory of Keycloak during the run. IMHO node_memory_Active_bytes is a sum of the container_memory_* metrics, so it is likely to lead to the same problem we see in the current setup.

Looking at jvm_memory_used_bytes, this might be a route to a possible metric. Unfortunately, when looking at our current metrics, it fluctuates quite a lot, and we don't have a step-wise increase like we had with the old memory settings:

[image: jvm_memory_used_bytes over the test run]

So I think we have some options here:

Happy to hear your thoughts.

andyuk1986 commented 4 months ago

Hey @ahus1, so yes: vendor_statistics_data_memory_used is always 0, as we don't have size-based eviction on the caches. We don't have any other metric showing memory consumption by entries, only the number of entries.

I have found one more formula which can be used: node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes), but it shows the total memory usage in bytes on the node (without the option to specify a namespace or container).
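Just as a plain arithmetic illustration of that formula (the byte values below are invented sample numbers, not measurements):

```java
public class NodeMemoryUsage {
    public static void main(String[] args) {
        // Hypothetical sample values, in bytes, as they would come back from Prometheus.
        long memTotal = 64L * 1024 * 1024 * 1024; // node_memory_MemTotal_bytes
        long memFree  = 20L * 1024 * 1024 * 1024; // node_memory_MemFree_bytes
        long buffers  = 1L  * 1024 * 1024 * 1024; // node_memory_Buffers_bytes
        long cached   = 11L * 1024 * 1024 * 1024; // node_memory_Cached_bytes

        // Memory in use on the node = total minus free, buffers and page cache.
        long used = memTotal - (memFree + buffers + cached);
        System.out.printf("node memory in use: %d GiB%n", used / (1024L * 1024 * 1024));
    }
}
```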

Perhaps we can leave it as it is for now, especially since the numbers have been positive for the last 10 days. I will look through more metrics when I have some time to see whether we can replace it with another one. WDYT?

ahus1 commented 4 months ago

Given your summary, I agree to stick with the current metric. I see that Kamesh is about to add a pod restart at the beginning of the test, and this might help to get a better metric.
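To illustrate why the restart could help (a purely hypothetical sketch; the real calculation lives in the benchmark scripts and may differ): if the metric is derived from the memory growth over the run, a pod whose memory ends up below its starting value produces a negative delta and therefore a negative activeSessionsPer500MbPerPod value, while a freshly restarted pod starts from a low baseline and keeps the delta positive.

```java
public class SessionsPerMemorySketch {
    // Hypothetical calculation: active sessions per 500 MB of memory growth per pod.
    // If memory at the end of the run is lower than at the start (e.g. a warm pod
    // that shrinks during the run), the delta and hence the result turn negative.
    static double activeSessionsPer500MbPerPod(long activeSessions, long startBytes,
                                               long endBytes, int pods) {
        double deltaIn500MbUnits = (endBytes - startBytes) / (500.0 * 1024 * 1024);
        return activeSessions / deltaIn500MbUnits / pods;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        // Warm pod that shrinks during the run -> negative metric.
        System.out.println(activeSessionsPer500MbPerPod(10_000, 1_500 * mb, 1_200 * mb, 3));
        // Freshly restarted pod with a low baseline -> positive metric.
        System.out.println(activeSessionsPer500MbPerPod(10_000, 400 * mb, 1_200 * mb, 3));
    }
}
```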

About memory-based eviction: I hear now and then that people find it complicated to pick the right number of cache entries and would rather pick a memory size instead. Keycloak has been using the number of entries for as long as I can remember, and I don't recall why it was picked.

Are you aware of any drawbacks of memory-based eviction? We could bring it to one of our team meetings to discuss. If we changed it, we would get a better metric and help customers simplify sizing their application memory.

With local caches, I would assume Infinispan would need to run the serializer to compute the size, and this would add CPU overhead?

andyuk1986 commented 4 months ago

So basically, eviction lets you control the size of the data container by removing entries from memory when the container becomes larger than a configured threshold, i.e. either when the total number of entries is reached (configured with max-count) or when the maximum amount of memory is allocated (configured with max-size).
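For illustration, a minimal sketch of the two eviction styles using Infinispan's embedded ConfigurationBuilder API; the thresholds and the protostream encoding are example choices for this sketch, not values taken from the Keycloak setup:

```java
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

public class EvictionConfigSketch {
    public static void main(String[] args) {
        // Count-based eviction: start evicting once the cache holds more than 10,000 entries.
        Configuration countBased = new ConfigurationBuilder()
                .memory().maxCount(10_000)
                .build();

        // Size-based eviction: start evicting once the entries occupy more than 500 MB.
        // Entries need a binary storage format so their size can be computed,
        // hence the protostream encoding here (an assumption for this sketch).
        Configuration sizeBased = new ConfigurationBuilder()
                .encoding().mediaType("application/x-protostream")
                .memory().maxSize("500MB")
                .build();

        System.out.println(countBased);
        System.out.println(sizeBased);
    }
}
```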

Eviction drops one entry from the data container at a time and is local to the node on which it occurs. So to keep the data consistent within the cluster, it is also necessary to configure a persistent store where the data will be kept. The persistent store can be configured with passivation enabled or disabled. With passivation disabled, data always goes to the store, so the persistent store is always a superset of the cache. With passivation enabled, data goes to the persistent store only when it is evicted from the cache or when a node shuts down.

But there is one more thing: I think in our case with clustered caches we would need a persistent store that is shared between the nodes, and passivation is not allowed with shared cache stores. That would mean the data is duplicated in both the in-memory cache and the persistent store, which could be an overhead.

More information regarding performance impact from our performance tuning guide:

Eviction has some impact on performance which comes from the additional processing that Infinispan needs to calculate when the size of a cache reaches the configured threshold. Eviction can also slow down read operations. For example, if a read operation retrieves an entry from a cache store, Infinispan brings that entry into memory and then evicts another entry. This eviction process can include writing the newly evicted entry to the cache store, if using passivation. When this happens, the read operation does not return the value until the eviction process is complete.

@pruivo please correct me if I am wrong or have missed something.

pruivo commented 4 months ago

I don't know why we need a shared store. The only advantage is that the data lives outside the Keycloak servers, which is a benefit if the Keycloak servers are nuked. In any case, there is no overhead without passivation, since it will evict the least-used keys from memory anyway.

Memory-based eviction does not work with the current KC code. It has to marshall the entries to compute their size, and it stores them in binary format. And we know that the replacement of user/client sessions will fail in that scenario.

ahus1 commented 4 months ago

With the metric being stable after the restart, we won't invest time here for now. With persistent sessions on the horizon, this metric will be less important in the future and will eventually be removed from the sizing guide.