unbit / uwsgi

uWSGI application server container
http://projects.unbit.it/uwsgi

gevent plugin causes uneven load balancing between workers #1605

Open drolando opened 7 years ago

drolando commented 7 years ago

We've been experimenting a bit with the gevent plugin. It usually works great, but it causes most of the requests to be routed to the same worker. What we see is that about 70% of the requests go to one worker, 15% go to a second one, and so on.

I added some logging and noticed that libev tends to notify the watchers mostly in the same order, which leads to most requests going to the first worker. The only reason it's not getting all of them is that it's sometimes doing CPU work and is not blocked on I/O, so requests coming in at that point are accepted by another worker.
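To make the effect concrete, here is a toy simulation of that behaviour. It is only a sketch under stated assumptions, not uWSGI or libev code: workers are always notified in the same fixed order, each worker is independently busy with CPU work with a made-up probability (`busy_probability`), and the first idle worker accepts the request. The function names and parameters are hypothetical.

```python
import random


def pick_worker(rng, num_workers, busy_probability):
    """Return the first idle worker, always scanning in the same order."""
    while True:
        for worker in range(num_workers):
            # A worker doing CPU work is not blocked on I/O and misses the event.
            if rng.random() >= busy_probability:
                return worker
        # Every worker was busy: the request stays queued and we scan again.


def simulate_fixed_order(num_workers=4, num_requests=10_000,
                         busy_probability=0.3, seed=0):
    """Count accepted requests per worker under fixed-order notification."""
    rng = random.Random(seed)
    accepted = [0] * num_workers
    for _ in range(num_requests):
        accepted[pick_worker(rng, num_workers, busy_probability)] += 1
    return accepted


if __name__ == "__main__":
    counts = simulate_fixed_order()
    for worker, count in enumerate(counts):
        print(f"worker {worker}: {count / sum(counts):.1%}")
```

With these assumed numbers the first worker ends up with roughly 70% of the requests and each later worker gets a rapidly shrinking share, which is the same shape as the distribution described above.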

Normal uWSGI also has a similar problem, which we solved by using thunder-lock and the SysV IPC semaphores. However, that won't work here, since the gevent plugin uses wsgi_req_simple_accept, which doesn't use thunder-lock, instead of wsgi_req_accept.
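For comparison, this is a toy model of why serializing accept behind a lock can even things out. It assumes that workers waiting on the lock are served in FIFO order, which is an assumption of this sketch and not something taken from the uWSGI source. The function name and parameters are hypothetical.

```python
from collections import deque


def simulate_fifo_lock(num_workers=4, num_requests=10_000):
    """Count accepted requests per worker when accept() is guarded by a lock
    whose waiters are served in FIFO order (an assumption of this model)."""
    waiting = deque(range(num_workers))  # workers queued on the lock
    accepted = [0] * num_workers
    for _ in range(num_requests):
        worker = waiting.popleft()  # the longest-waiting worker gets the lock
        accepted[worker] += 1       # it accepts one request and releases the lock
        waiting.append(worker)      # after handling the request it queues again
    return accepted


if __name__ == "__main__":
    print(simulate_fifo_lock())
```

In this model the lock queue, not the event loop's notification order, decides who accepts next, so the requests are spread round-robin across the workers.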

Has anyone else had the same problem and do you know if there's an easy way to fix it?

unbit commented 7 years ago

Hi, it is a common misconception that "even load balancing" between CPUs is a good thing by default.

The kernel (in theory) knows which process to wake up on a file descriptor, considering various factors (especially CPU cache).

This is why, without a specific locking strategy, you will always get "unfair" distribution.

Having said that, it would be super-optimistic to believe the kernel is 100% right in its decisions :)

What you want to check is the CPU affinity configuration (--cpu-affinity).

There are a bunch of posts online about various usages, but I strongly suggest you start from here (an old 2011 post):

http://lists.unbit.it/pipermail/uwsgi/2011-March/001594.html

marc1n commented 6 years ago

We also have a problem with unfair load distribution between processes under the gevent loop engine.

I hope that thunder-lock would help in our situation.

I don't understand why the --thunder-lock option is not implemented in loop engines other than simple and rbthreads. (By the way, the startup output "thunder lock: enabled" is misleading, since in fact the thunder lock is not used.)

@unbit Is there any technical reason for not implementing thunder-lock in most of the uWSGI loop engines?

marc1n commented 6 years ago

BTW: Better load distribution among processes with --lock-engine ipcsem --thunder-lock is mentioned in the official documentation: https://uwsgi-docs.readthedocs.io/en/latest/articles/SerializingAccept.html#when-sysv-ipc-semaphores-are-a-better-choice