unbit / uwsgi

uWSGI application server container
http://projects.unbit.it/uwsgi

programmed mule logs "Resource temporarily unavailable" when running in a farm #1325

Open joshuahlang opened 8 years ago

joshuahlang commented 8 years ago

In a simple setup with a mule farm of more than one mule, I'm seeing the following error for roughly 50% of the messages I send to the farm:

read(): Resource temporarily unavailable [plugins/python/uwsgi_pymodule.c line 1532]

Here is a simple repro (triggered by hitting http://localhost:8080 a few times in quick succession):

uwsgi --mule=mule.py --mule=mule.py --farm=test:1,2 --http :8080 --wsgi-file app.py

mule.py:

import uwsgi
import os

while True:
    msg = uwsgi.farm_get_msg()
    if msg is not None:
        print(os.getpid(), msg)

app.py:

import uwsgi

def application(env, start_response):
    start_response('200 OK', [('Content-Type','text/html')])
    uwsgi.farm_msg(
        'test',
        'message for the mule to process'
    )
    return [b"Hello World"]

joshuahlang commented 8 years ago

Has anyone else seen this? Or have any ideas on how to work around this issue?

marc1n commented 8 years ago

Probably the same cause as in #1266: uwsgi.farm_get_msg() reads from a non-blocking descriptor, and read() fails with EAGAIN/EWOULDBLOCK (errno 11: Resource temporarily unavailable).

From "man 2 read":

EAGAIN
    The file descriptor fd refers to a file other than a socket and has been marked nonblocking (O_NONBLOCK), and the read would block.

EAGAIN or EWOULDBLOCK
    The file descriptor fd refers to a socket and has been marked nonblocking (O_NONBLOCK), and the read would block. POSIX.1-2001 allows either error to be returned for this case, and does not require these constants to have the same value, so a portable application should check for both possibilities.

Here is the relevant uWSGI code fragment from https://github.com/unbit/uwsgi/blob/master/plugins/python/uwsgi_pymodule.c#L1524:

    for(i=0;i<count;i++) {
        if (farmpoll[i].revents & POLLIN) {
            len = read(farmpoll[i].fd, message, 65536);
            break;
        }
    }
    UWSGI_GET_GIL;
    if (len <= 0) {
        uwsgi_error("read()");
        free(farmpoll);
        Py_INCREF(Py_None);
        return Py_None;
    }
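
For illustration only, here is a minimal Python sketch of the distinction the fragment above does not make: a non-blocking read that treats EAGAIN/EWOULDBLOCK as "no message right now" and lets every other error propagate. The helper name read_message and the plain os.pipe() are assumptions made for the demo. This is not uWSGI code and not a proposed patch.

```python
import errno
import os


def read_message(fd, size=65536):
    """Hypothetical helper: read from a non-blocking fd.

    Returns the bytes read, or None when the read would block.
    """
    try:
        return os.read(fd, size)
    except OSError as exc:
        # POSIX allows either constant here, so check both.
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return None
        raise  # any other error is a real failure


# Demo on an ordinary pipe standing in for the farm descriptor.
r, w = os.pipe()
os.set_blocking(r, False)

empty = read_message(r)   # nothing written yet, so the read would block
os.write(w, b"hello")
full = read_message(r)    # data is available now

os.close(r)
os.close(w)
```

With this shape, an empty non-blocking read is reported to the caller as None without being logged as an error.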

joshuahlang commented 8 years ago

In theory, the code has already indicated that there is data on the socket to read, i.e. the check: if (farmpoll[i].revents & POLLIN) {

My guess is that multiple mules are reading from a shared socket, which leads to a race condition: all mules are signaled that data is available to read, but only the first reader actually gets the data, and every other reader fails with EAGAIN (unless something else is written to the socket before its read occurs). If that is the case, it seems uWSGI should ignore that error.
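
The suspected race can be reproduced outside uWSGI. The sketch below is a single-process simulation, assuming the two "mules" share one non-blocking descriptor; the socket.socketpair() is a stand-in, and how uWSGI actually wires up the farm is not confirmed here. Both readers check readiness before either one reads, so both are told data is available, but only the first read succeeds.

```python
import select
import socket


def simulate_two_mules():
    # Stand-in for the farm channel: one writer end, one shared reader end.
    writer, reader = socket.socketpair()
    reader.setblocking(False)
    writer.send(b"message for the mule to process")

    # Both "mules" poll before either one reads, so both see the
    # descriptor as readable.
    ready_1, _, _ = select.select([reader], [], [], 1.0)
    ready_2, _, _ = select.select([reader], [], [], 1.0)

    results = []
    for name, ready in (("mule 1", ready_1), ("mule 2", ready_2)):
        if not ready:
            results.append((name, None))
            continue
        try:
            results.append((name, reader.recv(65536)))
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK: the other reader already took the message.
            results.append((name, None))

    writer.close()
    reader.close()
    return results


results = simulate_two_mules()
for name, msg in results:
    print(name, msg)
```

Here "mule 2" was told the descriptor was readable and still came up empty, which matches the error pattern described above and suggests it is harmless to treat as "no message".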