python-zk / kazoo

Kazoo is a high-level Python library that makes it easier to use Apache Zookeeper.
https://kazoo.readthedocs.io
Apache License 2.0

Calling `zk_client.stop()` from a worker greenlet causes a deadlock #426

Open stefanmh opened 7 years ago

stefanmh commented 7 years ago

PoC:

import gevent
import gevent.monkey
gevent.monkey.patch_all()

from kazoo.client import KazooClient
from kazoo.handlers.gevent import SequentialGeventHandler

zk = KazooClient(hosts='127.0.0.1', handler=SequentialGeventHandler())
zk.start()

def f(_):
    # Watch callback: runs on the handler's worker greenlet.
    print('stopping')
    zk.stop()  # never returns
    print('stopped')

zk.get_children('/', watch=f)

# Creating a child of '/' fires the watch registered above.
zk.create('/stefan-poc', ephemeral=True)

def tick():
    while True:
        print('ticking')
        gevent.sleep(1)
gevent.spawn(tick)

gevent.sleep(5)

print('exiting')

Actual output:

stopping
ticking
ticking
ticking
ticking
ticking
exiting
ticking
ticking
ticking

zk.stop() pushes a _STOP to the worker's queue and then joins on the worker greenlet. However, the worker never gets to process the _STOP message, because it is still running f, and f is itself blocked inside zk.stop().
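
The mechanism can be reproduced without ZooKeeper. What follows is a minimal sketch, not kazoo's code: it uses plain threads instead of greenlets, and `CallbackWorker` and the two demo functions are hypothetical names made up for illustration. A timeout stands in for the indefinite hang so that the script terminates. The second demo shows one possible workaround, handing the stop off to a separate thread; with gevent the analogue would presumably be `gevent.spawn(zk.stop)`, which is untested here.

```python
import queue
import threading

_STOP = object()  # sentinel, mirroring the _STOP message described above


class CallbackWorker:
    """Toy sequential callback worker (an illustration, not kazoo's code)."""

    def __init__(self):
        self._queue = queue.Queue()
        self._stopped = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is _STOP:
                break
            item()  # callbacks run one at a time on this thread
        self._stopped.set()

    def submit(self, callback):
        self._queue.put(callback)

    def stop(self, timeout=None):
        """Queue _STOP and wait for the worker to exit; True if it did."""
        self._queue.put(_STOP)
        return self._stopped.wait(timeout)


def demo_stop_inside_callback():
    """Call stop() from a callback; the worker cannot reach _STOP in time."""
    worker = CallbackWorker()
    result = []
    done = threading.Event()

    def callback():
        # Without the timeout this wait would never end: the worker is
        # busy running this very callback.
        result.append(worker.stop(timeout=0.2))
        done.set()

    worker.submit(callback)
    done.wait(2)
    return result[0]


def demo_stop_handed_off():
    """Hand stop() to another thread so the callback can return first."""
    worker = CallbackWorker()
    result = []
    done = threading.Event()

    def stopper():
        result.append(worker.stop(timeout=2))
        done.set()

    def callback():
        threading.Thread(target=stopper, daemon=True).start()

    worker.submit(callback)
    done.wait(5)
    return result[0]
```

In this model, `demo_stop_inside_callback()` reports that the stop did not complete, while `demo_stop_handed_off()` reports that it did, because the callback returns and the worker is then free to process `_STOP`.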

The documentation should probably be improved, or a fix should be considered (maybe a stop_async?).

The PoC also illustrates a second issue: zk.stop() is registered to run atexit, so the Python script above never terminates (only one zk.stop() can run at a time, and the first run is frozen). Registering an atexit callback that blocks should be avoided.
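
One way an exit hook could avoid waiting behind a frozen stop is to skip when a stop is already in flight. This is only a sketch under that assumption: `GuardedStop` is a hypothetical helper, not part of kazoo, and the lambda stands in for a real stop function.

```python
import atexit
import threading


class GuardedStop:
    """Wrap a stop function so a second caller skips instead of blocking.

    Hypothetical helper for illustration; not part of kazoo.
    """

    def __init__(self, stop_func):
        self._stop_func = stop_func
        self._lock = threading.Lock()

    def __call__(self):
        # Non-blocking acquire: if another stop is already in flight,
        # return immediately rather than queueing up behind it.
        if not self._lock.acquire(blocking=False):
            return False
        try:
            self._stop_func()
            return True
        finally:
            self._lock.release()


calls = []  # records invocations of the stand-in stop function
guarded_stop = GuardedStop(lambda: calls.append("stopped"))
atexit.register(guarded_stop)  # the exit hook no longer waits on a stuck stop
```

The trade-off is that the exit hook may leave the session open if the first stop never finishes, which for an ephemeral session means relying on the server-side session timeout.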