Open wacuuu opened 4 years ago
60 containers, 40 cores, 10 perf counters: that gives 24 000 perf metrics (60x40x10). The data volume of the scraped Prometheus metrics is 144.5 MB (156 MB - 11.5 MB, based on the data from the table).
I think we could aggregate the perf metrics by "event" and "id" and expose them in this aggregated form on the Prometheus endpoint. For 60 containers, 40 cores and 10 perf counters, the aggregated form would have 600 perf metrics (60x10), and the estimated data volume for the aggregated perf metrics is about 3.6125 MB (144.5/40), so significantly less. In my opinion we could add an additional runtime parameter for cAdvisor, e.g. --perf_aggregate=true.
@dashpole what do you think?
The aggregation idea is shown in https://github.com/google/cadvisor/pull/2611
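To make the idea concrete (this is only a minimal sketch of the aggregation, not the code in the PR; the `PerfSample` type and its field names are made up for illustration), summing the per-CPU values per event and container looks roughly like this:

```go
package main

import "fmt"

// PerfSample is a hypothetical flattened perf reading as exposed today:
// one value per (container, event, cpu).
type PerfSample struct {
	Container string
	Event     string
	CPU       int
	Value     uint64
}

// aggregateByEvent drops the per-CPU dimension, leaving one summed value
// per (container, event) pair: 60x10 = 600 series instead of 60x40x10 = 24 000.
func aggregateByEvent(samples []PerfSample) map[string]map[string]uint64 {
	agg := make(map[string]map[string]uint64)
	for _, s := range samples {
		if agg[s.Container] == nil {
			agg[s.Container] = make(map[string]uint64)
		}
		agg[s.Container][s.Event] += s.Value
	}
	return agg
}

func main() {
	samples := []PerfSample{
		{Container: "app", Event: "instructions", CPU: 0, Value: 1000},
		{Container: "app", Event: "instructions", CPU: 1, Value: 1200},
		{Container: "app", Event: "cycles", CPU: 0, Value: 5000},
	}
	fmt.Println(aggregateByEvent(samples)) // map[app:map[cycles:5000 instructions:2200]]
}
```

With aggregation the cardinality drops from containers x cores x events to containers x events, which is where the 600 vs. 24 000 figure above comes from.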
Could we make perf metrics respect the percpu disable_metrics parameter?
I think we can use the percpu disable_metrics parameter for perf metrics.
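As a rough sketch of what respecting that setting could look like on the exposition side (the `percpuDisabled` flag stands in for whatever would be derived from --disable_metrics=percpu, and the metric/label names are illustrative, not the actual cAdvisor API):

```go
package main

import "fmt"

// emitPerfMetrics writes either per-CPU or aggregated lines depending on
// whether the "percpu" metric group is disabled.
func emitPerfMetrics(percpuDisabled bool, event string, perCPU map[int]uint64) {
	if percpuDisabled {
		var total uint64
		for _, v := range perCPU {
			total += v
		}
		fmt.Printf("container_perf_events_total{event=%q} %d\n", event, total)
		return
	}
	for cpu, v := range perCPU {
		fmt.Printf("container_perf_events_total{event=%q,cpu=\"%d\"} %d\n", event, cpu, v)
	}
}

func main() {
	emitPerfMetrics(true, "instructions", map[int]uint64{0: 1000, 1: 1200})
	// container_perf_events_total{event="instructions"} 2200
}
```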
Hi,
I did some performance measurements to get an insight into how cAdvisor with perf events enabled would behave on a production-like system. A word about my setup:
I have cAdvisor deployed as a daemonset as it is in the examples; the only differences are that the CPU limit is set to 5 (to avoid a bottleneck there) and that I changed the perf configs. As for the load, I have a deployment that consists of a single pod with two containers; they exist only to be entities for metric generation. Besides cAdvisor and the load, the machine has only the containers responsible for keeping the node in the cluster (network manager, internal Kubernetes services, etc.).
To measure response time and data volume, I executed the following command from the node running the load and cAdvisor:
time curl localhost:8080/metrics > /dev/null
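(In case anyone wants to repeat the measurement without curl's overhead, here is a small Go sketch that times the scrape and reports the payload size; it assumes the same localhost:8080/metrics endpoint as above:)

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

func main() {
	start := time.Now()
	resp, err := http.Get("http://localhost:8080/metrics")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Count the bytes without keeping the payload in memory.
	n, err := io.Copy(io.Discard, resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Printf("scrape took %v, payload %.1f MB\n", time.Since(start), float64(n)/1e6)
}
```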
These are the results:
I would like to emphasize that 60 containers on a 40-core machine is not the worst-case scenario that could happen. Also, in terms of data scraping this is an optimistic case, where the process is not exposed to the datacenter network, which typically carries traffic from other applications and other nodes.
A production environment would be expected to have hundreds of nodes, each running cAdvisor measuring a couple of perf events, with hundreds of containers per node, all of this scraped every couple of seconds. With numbers like those in the tests, it is highly unlikely that this setup would not brick production with network overload. Therefore there is a need to optimize the amount of data served with perf events.
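To put a rough number on it (the node count and scrape interval below are assumptions; the per-scrape size is the 144.5 MB measured above):

```go
package main

import "fmt"

func main() {
	const (
		nodes          = 300   // assumed fleet size
		scrapeMB       = 144.5 // measured payload per scrape, non-aggregated
		scrapeInterval = 15.0  // seconds, typical Prometheus default
	)
	perNodeMBps := scrapeMB / scrapeInterval
	fmt.Printf("per node: %.1f MB/s, fleet: %.1f MB/s (~%.1f Gbit/s)\n",
		perNodeMBps, perNodeMBps*nodes, perNodeMBps*nodes*8/1000)
	// per node: 9.6 MB/s, fleet: 2890.0 MB/s (~23.1 Gbit/s)
}
```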