hazelnutsgz opened 3 years ago
I don't believe we have this today but could be useful. cc @rojkov @jmarantz
To answer the question, I don't think there's a current feature in Envoy that would enable you to manually bind workers to cores.
I am assuming you are suggesting this because you see some potential for performance improvement relative to the automatic assignments done by the CPU/OS for some workload.
Here are some possible complexities with this path:

- The number of cores you reserve for auxiliary (non-worker) threads may depend on your hardware, your workload, how often you receive xDS updates, and which extensions you have enabled.
My suggestion is to visualize the CPU usage per thread and see if you can find some behavior from the automatic thread/core assignments that you feel is worth overriding. Then do a ton of benchmarks (see https://github.com/envoyproxy/nighthawk) to prove you've made things better.
It would be great to get some perf benefit from this!
Really appreciate the replies from you guys. I haven't conducted systematic benchmarking yet.
Actually, what I did (I know it's not convincing, lol) was write a multi-threaded epoll proxy from scratch (with SO_REUSEPORT enabled, each thread serving all potential ports), and I did observe a performance gap related to CPU binding.
> My suggestion is to try to visualize the cpu usage per thread and see if you can find some behavior from automated thread/core bindings that you feel it's worth overriding. Then do a ton of benchmarks (see https://github.com/envoyproxy/nighthawk) to prove you've made things better.
Makes a lot of sense; I'll take that approach.
It would be interesting to see how much perf gain could be achieved with thread pinning vs. process pinning with taskset within the same NUMA node. If your workload is orchestrated with Kubernetes, the latter can be configured with a topology-aware kubelet.
@hazelnutsgz are you going to work on this? If not, I'm interested in taking a look at this issue to see what we can find.
@soulxu I'm interested in this feature. Do you have anything forked?
Please go ahead. I'm working on other stuff and really don't have the bandwidth to work on this for now.
I was wondering: is there any feature/API that could bind a worker to a dedicated CPU? Something like the code in
tools/perf/bench/epoll-wait.c
Thanks~