0rpc / zerorpc-python

zerorpc for python
http://www.zerorpc.io

gevent_zeromq bug triggered when zerorpc does not run in the main thread #224

Open Overv opened 4 years ago

Overv commented 4 years ago

If I run the server or client in a thread other than the main thread, it seems to trigger the missing event bug:

import zerorpc
import threading
import time

def run_server():
    class HelloRPC:
        def hello(self, name, age):
            return f"Hello, {name} of age {age}"

    s = zerorpc.Server(HelloRPC())
    s.bind("tcp://0.0.0.0:1234")
    s.run()

# Start server in background
server_thread = threading.Thread(target=run_server)

server_thread.daemon = True
server_thread.start()

# Run client
c = zerorpc.Client("tcp://127.0.0.1:1234")

for _ in range(2):
    print(c.hello("Thread", 1234))
    time.sleep(2)

Output:

$ python experiments/rpc_threading/gevent_bug.py
Hello, Thread of age 1234
/!\ gevent_zeromq BUG /!\ catching up after missing event (RECV) /!\
Hello, Thread of age 1234

How do I fix this? Shouldn't it work fine since the client and server both have their own thread-local event loop with gevent?

bombela commented 4 years ago

Using threads with gevent is a dangerous game. In your case, you are sharing the zerorpc context across threads. You should instantiate a new zerorpc.Context per thread, and pass it to zerorpc.Client/Server explicitly.

The missed-event warning normally occurs because of a race condition when waiting for events on the zmq socket. The zmq socket is edge-triggered, not level-triggered, so zerorpc has to be waiting for an event before it is delivered; otherwise the event is missed. There is a small window of time where it is possible to miss the event. zerorpc detects missed events by periodically checking the status of the zmq socket, and the time between checks translates into a latency penalty on missed events.
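To make the race concrete, here is a toy model in plain Python. It is not zerorpc or zmq code: EdgeTriggeredSocket and run_scenario are invented for illustration, and the "periodic check" is reduced to a single status test. It only shows why a notification fired before anyone is waiting gets lost, and how checking the socket status recovers the message.

```python
from collections import deque


class EdgeTriggeredSocket:
    """Toy socket that notifies only on the empty -> readable transition."""

    def __init__(self):
        self._queue = deque()
        self._waiter = None  # callback of a reader currently waiting, if any

    def wait_for_edge(self, callback):
        # Register interest in the NEXT edge; past edges are not replayed.
        self._waiter = callback

    def deliver(self, message):
        was_empty = not self._queue
        self._queue.append(message)
        if was_empty and self._waiter is not None:
            waiter, self._waiter = self._waiter, None
            waiter()
        # If nobody is waiting at this moment, the edge is simply lost.

    def readable(self):
        return bool(self._queue)

    def recv(self):
        return self._queue.popleft()


def run_scenario(register_before_delivery):
    sock = EdgeTriggeredSocket()
    woken = []

    def on_edge():
        woken.append(True)

    if register_before_delivery:
        sock.wait_for_edge(on_edge)
        sock.deliver("hello")
    else:
        sock.deliver("hello")        # edge fires with no waiter
        sock.wait_for_edge(on_edge)  # too late, never called

    # Status check: data is readable although no wake-up arrived.
    caught_up = sock.readable() and not woken
    return bool(woken), caught_up, sock.recv()
```

run_scenario(True) models the normal path, where the reader is woken by the edge. run_scenario(False) models the race: the reader is never woken, and the message is only picked up by the status check, which in the real library runs on a timer and therefore adds latency.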

Overv commented 4 years ago

Thanks for the quick response! Unfortunately, even with a context per thread I still run into issues with gevent (like the dreaded LoopExit exception). Perhaps it's due to threads unexpectedly shutting down, because I use zerorpc in combination with fusepy. I will go ahead and experiment more :)

Would it be possible to change zerorpc to not use gevent, or is gevent a critical component for its functionality?