Open errantmind opened 3 years ago
@errantmind This is definitely something worth digging into deeper. @msmith-techempower and I looked at this a while back and we did not find any performance degradation, though it's been some time (and several updates) since, and certainly things may have changed. At least, if this is the case, all frameworks should be affected the same.
We may not have time to take a look at this in the next couple of weeks, but feel free to drop more info here if you have it. Benchmarking logs with and without the default bridge would be helpful if you have them. Also curious if you were doing this on a single machine or using a multi-machine setup like we do on our Citrine environment.
Thanks for the report!
I'm doing this on a single machine, so that could be a factor. I have tried multiple frameworks, each of which experiences a degradation in network throughput (req/s) of about 20%, so you may be right that all frameworks are affected the same. However, I think it is worth looking into at some point because of how it might be affecting the top-end frameworks, which are already very close to each other in performance. Without this overhead (if it exists in your multi-machine environment) it may be possible for them to further differentiate. I'm working on a framework myself and am short on time, but after I get it submitted I'll try to submit some detailed logs.
Disabling userland proxy may alleviate this overhead.
Yes. Try setting `"userland-proxy": false` in your `daemon.json` (usually at `/etc/docker/daemon.json`) and restarting Docker. The overhead should be nowhere near 20% with this disabled.
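A minimal sketch of that change, assuming a systemd-managed Docker install (paths and the restart command may differ on your distro):

```shell
# Disable the userland proxy (docker-proxy) in the daemon config.
# If /etc/docker/daemon.json already exists, merge this key into it
# instead of overwriting the file.
echo '{ "userland-proxy": false }' | sudo tee /etc/docker/daemon.json

# Restart the daemon so the setting takes effect.
sudo systemctl restart docker
```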
At MS we run all the TE benchmarks with `--network host` for the same reasons.
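For reference, host networking is a per-container flag; a sketch (the image name here is just a placeholder):

```shell
# Run the container directly on the host's network stack: no bridge,
# no NAT, no userland proxy in the request path.
# Note that port mappings (-p) are ignored in host mode.
docker run --rm --network host my-framework-image
```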
OS (Please include kernel version)
Linux -- Ubuntu 20.04.2 LTS -- 5.11.7-051107-generic
Expected Behavior
Docker network configuration does not significantly degrade / bottleneck network throughput
Actual Behavior
Docker reduces framework throughput by ~20% just by being installed, even when not in use, and may also be bottlenecking the benchmark
Steps to reproduce behavior
1. Run `wrk` against a framework on the host, with and without docker installed. Note throughput of framework in benchmark, as well as latency.
2. Install `docker-ce` (which also installs `docker-ce-cli` and `containerd.io`).
3. Run the framework in a container based on `bullseye-slim`. Run `wrk` on host against framework running in the container and note performance is roughly the same as the previous step. Also note this does not change much when specifying `--network host`.
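For anyone reproducing the steps above, a typical `wrk` invocation looks like this (thread/connection counts, duration, and URL are just examples, not the exact parameters used here):

```shell
# 4 threads, 256 open connections, 30-second run against the
# framework under test; repeat with and without docker installed
# and compare the Requests/sec and Latency lines in the output.
wrk -t4 -c256 -d30s http://127.0.0.1:8080/plaintext
```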
Other details and logs
In my tests, if you run `docker network ls` and see any `bridge` networks, including the default one, system-wide performance is degraded. The default bridge network cannot be removed by any normal means (i.e. `docker network rm`). If these steps are followed, the default bridge network can be removed. Run the benchmark after following the steps and notice performance is almost restored to 'non-docker' levels. I still saw a ~5% throughput degradation after removing this network, but it was much better than otherwise. Note, TechEmpower installs a network called `tfb` which creates a bridge network, so I am fairly confident this is an issue worth discussing.

The basic reasoning I could find for the reduction in performance is Docker's default network configuration: it includes a bridge network which enables iptables, and that can slow down the whole system, even when docker is not in use. There are other network configurations which supposedly do not suffer from this issue, like macvlan or ipvlan, although it may be good enough to just use `--network host` without any bridge networks in existence.