Open reticulatedpines opened 1 year ago
Using vncdotool.api
creates a background-daemon-thread, which runs the Twisted Reactor: If not explicitly terminated by calling api.shutdown()
it will keep running. While it is marked as a daemon thread, which should not prevent Python from terminating, but mixing threading with multi-processing leads to all kinds of obscure issues.
gdb -p 117319 --batch -ex 'thread apply all py-bt'
Thread 3 (Thread 0x7f3fa3fff700 (LWP 117322) "python3"):
Traceback (most recent call first):
File "/usr/lib/python3.9/threading.py", line 312, in wait
waiter.acquire()
File "/usr/lib/python3.9/queue.py", line 171, in get
self.not_empty.wait()
File "vncdotool/venv/lib/python3.9/site-packages/twisted/_threads/_threadworker.py", line 46, in work
for task in iter(queue.get, _stop):
File "/usr/lib/python3.9/threading.py", line 892, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.9/threading.py", line 954, in _bootstrap_inner
self.run()
File "/usr/lib/python3.9/threading.py", line 912, in _bootstrap
self._bootstrap_inner()
Thread 2 (Thread 0x7f3fa8b2d700 (LWP 117321) "python3"):
Traceback (most recent call first):
File "vncdotool/venv/lib/python3.9/site-packages/twisted/internet/epollreactor.py", line 227, in doPoll
l = self._poller.poll(timeout, len(self._selectables))
File "vncdotool/venv/lib/python3.9/site-packages/twisted/internet/base.py", line 1331, in mainLoop
reactorBaseSelf.doIteration(t)
File "vncdotool/venv/lib/python3.9/site-packages/twisted/internet/base.py", line 1318, in run
self.mainLoop()
File "/usr/lib/python3.9/threading.py", line 892, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.9/threading.py", line 954, in _bootstrap_inner
self.run()
File "/usr/lib/python3.9/threading.py", line 912, in _bootstrap
self._bootstrap_inner()
Thread 1 (Thread 0x7f3faab4c740 (LWP 117319) "python3"):
Traceback (most recent call first):
File "/usr/lib/python3.9/threading.py", line 1428, in _shutdown
lock.acquire()
Looking at the threads and their .name
attribute:
main
thread, which is calling exit()
Twisted
, the main reactor by vncdotool.api.connect()
PoolThread-twisted.internet.reactor-0
from twisted/python/threadpool.py
; According to the code it inherits daemon
from its parent thread 2.PS: As stated above calling api.shutdown()
before sys.exit(0)
mitigates the issue, but we should find a real fix.
Ah, great, thanks for the info and quick workaround! I didn't know where to even start debugging twisted + multiprocessing. I can confirm that shutdown() lets it exit cleanly for me here, too. I thought it was weird that join() still succeeded, makes sense now.
It's curious that 1.0.0 doesn't exhibit this behaviour, I've run 100s of tests and it's been consistent, so I'd guess not a race or similar. It's beyond my ability to diagnose, so I'll leave that up to you.
Possibly docstring for api.connect() could mention shutdown() as being wise to call before attempting to exit? I don't know if that's an appropriate place - I would have seen it, but who knows if I'm typical.
Maybe the api object could be made into a context manager too, then there's an obvious place to automatically call shutdown(), though you'd have to nest two contexts just to get a client which is kind of ugly. Since nobody is using api as a context manager currently it shouldn't break existing code.
with vncdotool.api as a:
with a.connect(":1234") as client:
client.stuff()
# client teardown happens here
# api teardown here
I don't love it, but I guess it's not terrible looking.
Less likely to be appropriate, shutdown() could implicitly occur on leaving the context manager block for the connect()? Perhaps that's too surprising a side effect, since it's the client that leaves scope, not the api. More likely to break existing code.
Possibly, use atexit: https://docs.python.org/3/library/atexit.html I have no experience using this, it reads like conceptually the right thing to do. Register the handler to call shutdown() whenever the main reactor is started. It's stdlib which is nice.
Sadly for me, vncdotool.api.shutdown() doesn't help me in the larger code I was working on. Perhaps I found two different hang bugs. I'll try and work up another clean repro case.
Found my mistake: in the more complex code, I had multiple processes each handling a VNC connection. I'd missed adding api.shutdown() to one exit route. Your workaround is still effective.
Possible regression from 1.0.0, it seems that 1.1.0 doesn't let multiprocessing workers exit (even though they join). This means the main script also refuses to exit. Only tested on Linux (Debian Bullseye, and Fedora 37, identical behaviour).
I don't have much experience with vncdotool, multiprocessing or twisted, so I could easily be missing something obvious here. I do have a simple repro script though so it's hopefully easy to test. I tested in a venv with only these listed by freeze:
I didn't control for system packages, please let me know if that's relevant.
vncdotool version 1.1.0
VNC server and version None required.
Steps to reproduce Make a python3 venv, install vncdotool 1.1.0. Run the following repro script, it will fail to exit. Or fail to repro, let's find out :) Downgrade to 1.0.0 and it should exit.
Expected result Above repro script should exit.
Which erroneous result did you get instead Script hangs just after final print statement.
Additional information Possibly related to https://github.com/sibson/vncdotool/issues/188 But that lists using spawn start method as a workaround, and that doesn't help me.