felixsittenauer opened this issue 6 years ago
Do we have an update on this?
This also happens with a cluster of 3 managers and 3 workers and about 50 containers in a single stack.
We have 3 manager and 8 worker nodes and are still facing the high CPU usage; tasks are not being scheduled because of the CPU consumption!
Are you scheduling tasks on the manager nodes, or are they drained? Maybe they have RAM issues?
The experiment was done on AWS: the 3 manager nodes were Ubuntu 16.04 m5.xlarge instances with 4 vCPUs (3.1 GHz, Xeon Platinum 8000) and 16 GB RAM. The 100 worker nodes were Ubuntu 16.04 m5.large instances with 2 vCPUs (3.1 GHz, Xeon Platinum 8000) and 8 GB RAM. Manager and worker nodes are connected through an AWS Virtual Private Cloud (VPC) with up to 10 Gbit/s. Metrics were collected with Elastic Metricbeat. During the experiment the manager nodes consumed about 690 MB and the worker nodes about 680 MB of memory.
And just to confirm my doubts: did you drain the managers? (By default they also accept tasks.) Did you also see the same CPU behaviour on the workers?
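For reference, checking the managers' availability and draining them so they stop accepting tasks could look roughly like this. This is only a minimal sketch using the Docker SDK for Python, run against a manager node's Docker socket; the same thing can of course be done with the CLI.

```python
import docker

# Assumes the environment points at a manager node's Docker socket.
client = docker.from_env()

# List only the manager nodes and drain any that still accept tasks.
for node in client.nodes.list(filters={'role': 'manager'}):
    spec = node.attrs['Spec']
    if spec.get('Availability') != 'drain':
        print(f"Draining manager {node.attrs['Description']['Hostname']}")
        spec['Availability'] = 'drain'
        node.update(spec)  # update expects the complete node spec back
```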
I ran a performance test to measure the scheduling performance of Docker Swarm. For this purpose I measured the time it takes to schedule and start 1000 containers on 100 worker nodes, using a cluster of 3 manager nodes.
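A timing loop of that kind could look roughly like the sketch below, again using the Docker SDK for Python. The image name, service name and poll interval are assumptions for illustration, not the exact setup that was used in the experiment.

```python
import time
import docker
from docker.types import ServiceMode

client = docker.from_env()  # must point at a manager node

start = time.time()

# Hypothetical test service: 1000 replicas of a small long-running image.
service = client.services.create(
    'nginx:alpine',                       # placeholder image
    name='swarm-scale-test',
    mode=ServiceMode('replicated', replicas=1000),
)

# Poll until 1000 tasks report the "running" state.
while True:
    running = sum(1 for t in service.tasks()
                  if t['Status']['State'] == 'running')
    if running >= 1000:
        break
    time.sleep(0.5)

print(f"1000 tasks running after {time.time() - start:.1f} s")
```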
The graphs show the CPU usage of the 3 manager nodes and one worker node during the scheduling process. Time 0 is when the scheduling action was started.
In the first graph, no service had ever been created or scheduled before (fresh cluster).
In the second graph, the experiment had already been repeated several times.
While all 1000 containers were scheduled and started in under 2.5 seconds, the CPU usage is higher during the scheduling and is still over 150% 60 seconds after the scheduling finished.
What is going on here? Why does the fresh cluster have lower CPU usage?