What is the best way you've found to monitor those packet retries/losses so far @egernst? Maybe we can enhance/modify our existing CC ab/nginx test and port it over to Kata as a start, aiming to add something to the metrics CI maybe.
cc/ @jon @amshinde
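One low-tech way to watch retries/losses while the benchmark runs is to sample the kernel's TCP counters; a rough sketch (counter names vary slightly between kernel versions, and nothing here is wired into the metrics CI):

```sh
# Sample TCP retransmission / listen-drop counters on the host (or in the
# guest) every second while ab/hey is running.
watch -n 1 "netstat -s | grep -iE 'retrans|overflow|drop'"

# Per-connection view (including retransmit counts) for the nginx port
ss -ti 'sport = :8080'
```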
Well, it turns out tuning TCP isn't what's really going to change the performance here (though it is still a good idea). The primary issue for the poor performance is that when using overlay, index.html is stored on a 9p volume, and thus access is slow for each established session.
When testing with devicemapper instead of overlay, performance becomes very reasonable. Some data measured on an AWS i3.metal Xenial machine:
| concurrent requests | kata (req/sec) |
| --- | --- |
| 100 | 20,544.10 |
| 200 | 18,862.31 |
| 500 | 18,475.07 |
| 1000 | 19,218.69 |
This matches what I would normally expect for nginx on Kata (and performs well compared to runc). These numbers were grabbed using
`ab -n 10000 -c <concurrent value> http://<myserver>:8080`
and the nginx server was started with
`docker run --runtime=kata-runtime -itd --rm --cpus=8 --memory=16G -p 8080:80 nginx`
The memory is pretty arbitrary since we aren't memory bound, and the CPU count was just chosen to match the default number of queues used today in Kata (again, this will be made configurable and the default will match the number of vCPUs). Nginx is configured to use 8 workers and support 8192 worker connections.
We can close this issue after we have documentation / collateral in place describing a couple of ways to work around this issue (i.e., use devicemapper, or avoid 9p-based volumes by using ramfs or block-backed volumes).
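A minimal sketch of the first workaround, assuming a Docker host where the storage driver can simply be switched in /etc/docker/daemon.json (a production setup would normally use a direct-lvm thin pool rather than the default loop-lvm):

```sh
# Switch the Docker storage driver from overlay to devicemapper.
# NOTE: changing the storage driver makes existing images/containers
# inaccessible until you switch back.
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "storage-driver": "devicemapper"
}
EOF
sudo systemctl restart docker

# Confirm the active driver
docker info | grep -i 'storage driver'
```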
One of my suspicions around the 9p hit here is that, afaik, we run 9p in 'default cache mode', which I believe is 'cache=none'. 9p and caching is a bit of an area of tradeoffs aiui. @bergwolf @gnawux @WeiZhang555 - do you guys have any input and wisdom from your previous experiences of 9p optimisations? I suspect if the index file was cached then the performance would go up with 9p as well. But, we'd want to be sure what the side effects of enabling the cache are etc. The theory should be pretty easy to test out itself though.
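For anyone wanting to try that theory by hand, a rough sketch (the mount tag `myshare` below is hypothetical; the real tag depends on how the runtime wires up the virtio-9p device, and `cache=loose` is just one of the available cache modes):

```sh
# Inside the guest: mount a 9p share with a more aggressive cache mode,
# then re-run the ab/hey test against files served from it.
mkdir -p /mnt/9ptest
mount -t 9p -o trans=virtio,version=9p2000.L,cache=loose myshare /mnt/9ptest
```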
@grahamwhaley A late reply :stuck_out_tongue:
9pfs with cache enabled has some quite annoying problems with synchronizing data between host and guest. Although it gives better performance, I think it's better to keep cache mode disabled. The syncing problem will affect some use cases (such as logs/configMap via 9pfs), and I think the side effect outweighs the benefit. It's only my 2 cents.
Description of problem
When running an nginx server, testing with tools like `ab` and `hey` shows inconsistent and poor performance with respect to request handling rate (req/sec).
Expected result
A request is 612 bytes. I expect the TCP throughput of Kata Containers to be sufficient to handle a higher rate of requests. For example, iperf results measured on the same machine:
- 512B transmission: 8.82 Gbps bandwidth
- 1024B transmission: 18.1 Gbps bandwidth
Actual result
There are many retries and some errors observed when running. The resulting req/sec rate is too low when running with kata-runtime. Running `vmstat 1` in the container shows that the CPU is actually pretty idle, and we are not memory bound.
To run:
start the nginx server container (note: you should also constrain the number of vCPUs/memory, as below):
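For example, along these lines (the --cpus/--memory values are the ones used for the measurements quoted elsewhere in this thread; adjust to taste):

```sh
# Start nginx under the Kata runtime, pinning vCPUs and memory explicitly
docker run --runtime=kata-runtime -itd --rm --cpus=8 --memory=16G -p 8080:80 nginx
```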
exercise nginx:
I would run either `ab` or `hey` for a period of time (e.g., 60 seconds), each time with a different level of concurrency (e.g., 100, 200, 500, 1000).
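For example (a sketch; `-z` gives hey a fixed duration and `-c` sets the concurrency level for both tools):

```sh
# ab: send a fixed number of requests at a given concurrency
ab -n 10000 -c 100 http://<myserver>:8080/

# hey: run for 60 seconds at a given concurrency
hey -z 60s -c 100 http://<myserver>:8080/
```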
analysis / observations
I started off testing with just `ab`, but found that `hey` was a bit nicer to look at and provided a better summary of errors observed. Results seem unreliable, though, if there are too many errors (resolved after adjusting ulimit). See https://github.com/rakyll/hey
ulimit settings on host / guest:
Saw errors as a result of the ulimit settings on the host and guest: `[17142301] Get http://10.7.200.165:8080/: dial tcp 10.7.200.165:8080: socket: too many open files`
This would result in unreliable req/sec numbers; ulimit needs to be raised to a more sane value.
On i3.metal on AWS with Xenial installed, the host originally showed:
Updated:
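The exact limits shown originally aren't reproduced here; roughly, the check and the bump look like this (65535 is just an arbitrary "large enough" example value):

```sh
# Check the open-files limit for the shell that will run ab/hey
ulimit -n

# Raise it for the current shell before benchmarking
ulimit -n 65535
```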
number of queues for macvtap
Today this is hardcoded to 8: https://github.com/kata-containers/runtime/blob/16600efc1da0dc893c1a12424902553cf7d1266f/virtcontainers/network.go#L97
This should be set equal to the number of vCPUs (or less) by default, and be made configurable.
Tuning nginx itself:
Adjustments to the number of worker_processes and worker_connections (through /etc/nginx/nginx.conf):
- default: 1 worker, 1024 worker_connections
- adjusted: 8 workers, 8192 worker_connections
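A sketch of making that change in the running container (assuming the stock nginx image layout; `<container>` is a placeholder for the container name or ID):

```sh
# Bump worker_processes and worker_connections in the stock nginx.conf,
# then reload nginx so the new values take effect.
docker exec <container> sed -i 's/worker_processes .*/worker_processes 8;/' /etc/nginx/nginx.conf
docker exec <container> sed -i 's/worker_connections .*/worker_connections 8192;/' /etc/nginx/nginx.conf
docker exec <container> nginx -s reload
```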
With this adjustment:
sysctl tuning within the guest:
Sync queue sizes: tcp_max_syn_backlog
details from experiments tbd
somaxconn
details from experiments tbd
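As a sketch of the kind of adjustment meant here (the values are illustrative only, not results from the pending experiments):

```sh
# Inside the guest: enlarge the SYN backlog and the accept (listen) queue
sysctl -w net.ipv4.tcp_max_syn_backlog=8192
sysctl -w net.core.somaxconn=8192

# Inspect the current values
sysctl net.ipv4.tcp_max_syn_backlog net.core.somaxconn
```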
Sample result:
Sample result without making adjustments (req/sec is bogus due to socket / too many open files issue).