Open drolando opened 7 years ago
Hi, it is a common misconception that "even load balancing" between CPUs is a good thing by default.
The kernel (in theory) knows which process to wake up on a file descriptor, considering various factors (especially CPU cache).
This is why, without a specific locking strategy, you will always get "unfair" distribution.
Having said that, it is a super-optimistic way of thinking if we believe the kernel is 100% right in its decisions :)
What you want to check is the cpu affinity configuration (--cpu-affinity)
There are a bunch of posts online about various usages, but I strongly suggest you start from here (an old 2011 post):
http://lists.unbit.it/pipermail/uwsgi/2011-March/001594.html
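To see which CPUs a worker is actually allowed to run on, something like the following could be called from inside the app. This is only a minimal sketch, assuming a Linux host and a Python application; the helper name `current_cpu_affinity` is made up for illustration and is not part of uWSGI.

```python
import os


def current_cpu_affinity():
    """Return the sorted list of CPU ids this process may run on.

    Hypothetical helper for illustration. os.sched_getaffinity is only
    available on some platforms (e.g. Linux), so return None elsewhere.
    """
    if not hasattr(os, "sched_getaffinity"):
        return None
    # pid 0 means "the calling process"
    return sorted(os.sched_getaffinity(0))


if __name__ == "__main__":
    print(f"pid={os.getpid()} allowed CPUs: {current_cpu_affinity()}")
```

Logging this once per worker process makes it easy to confirm whether a `--cpu-affinity` setting had the effect you expected.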
We also have a problem with unfair load distribution between processes under the gevent loop engine.
I hope that `thunder-lock` would help in our situation.
I don't understand why the `--thunder-lock` option is not implemented in loop engines other than `simple` and `rbthreads` (btw, there is misleading output "thunder lock: enabled" on startup while in fact thunder lock is not used).
@unbit Is there any technical reason for not implementing `thunder-lock` in most of the uWSGI loop engines?
BTW: better load distribution among processes with `--lock-engine ipcsem --thunder-lock` is mentioned in the official documentation:
https://uwsgi-docs.readthedocs.io/en/latest/articles/SerializingAccept.html#when-sysv-ipc-semaphores-are-a-better-choice
We've been experimenting a bit with the `gevent` plugin: it usually works great, however it causes most of the requests to be routed to the same worker. What we see is that about 70% of the requests go to one worker, 15% go to a second one, and so on.
I added some logging and noticed that libev tends to notify the watchers mostly in the same order, which leads to most requests going to the first worker. The only reason it's not getting all of them is that it is sometimes doing CPU work and is not blocked on I/O, so requests coming in at that point are accepted by another worker.
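The skew described above can be reproduced with a toy model. This is a rough sketch under stated assumptions, not a model of libev or uWSGI internals: workers are always notified in the same fixed order, each one is independently "blocked on I/O" (and so able to accept) with probability `p_idle`, and the first such worker takes the request. The function name and all parameters are made up for illustration.

```python
import random


def simulate_accepts(num_workers=4, num_requests=10_000, p_idle=0.7, seed=0):
    """Count how many requests each worker accepts in the toy model."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    accepted = [0] * num_workers
    for _ in range(num_requests):
        winner = None
        # Keep re-notifying until some worker is free to accept.
        while winner is None:
            for worker in range(num_workers):  # same order every time
                if rng.random() < p_idle:  # worker is waiting on I/O
                    winner = worker
                    break
        accepted[winner] += 1
    return accepted


if __name__ == "__main__":
    counts = simulate_accepts()
    total = sum(counts)
    for worker, count in enumerate(counts):
        print(f"worker {worker}: {count / total:.1%}")
```

With these assumed numbers the first worker ends up with roughly 70% of the requests and each later worker gets a sharply smaller share, which is the same shape as the distribution reported here.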
Normal uWSGI also has a similar problem, which we solved by using `thunder-lock` and the `sysv ipc` semaphores. However, that won't work here since the gevent plugin uses `wsgi_req_simple_accept`, which doesn't use thunder-lock, instead of `wsgi_req_accept`.
Has anyone else had the same problem, and do you know if there's an easy way to fix it?