sibson / vncdotool

A command line VNC client and python library
Other
453 stars 121 forks source link

vncdotool under multiprocess requires "spawn" start method #188

Open guicho271828 opened 3 years ago

guicho271828 commented 3 years ago

A follow-up of #182 .

In a multiprocess-based environment, vncdotool requires multiprocessing.set_start_method("spawn") in order to run correctly. "spawn" is the default method on darwin, while "fork" is the default method on Linux. When the start method is "fork", transmission through RFB buffer may fall in a deadlock.

It seems that the lock state in an async message queue inside vncdotool is accidentally shared between processes created by multiprocessing. I don't know how python implements an async queue, but whatever it is using, it would be copied and then shared between processes if fork is used, as fork will copy the entire process state. spawn will avoid such accidental sharing, as it launches a new python interpreter.

To reproduce, check out this repo, which includes a docker container for tightvncserver and ratpoison. It launches multiple containers, then a python script tries to connect to them in parallel and perform a sequence of mouse clicks, and finally saving the screenshot. When fork is used, it does not produce a consistent result, and often freezes. When spawn is used, it always succeeds.

Potential fix is either

guicho271828 commented 3 years ago

check out this branch https://github.com/guicho271828/vncdotool/tree/multiprocessing-test

guicho271828 commented 3 years ago

any chance you try this, @sibson ?

sibson commented 3 years ago

This comment on SO, https://stackoverflow.com/a/66113051/170997 supports your analysis. Nice work. I'd be happy to merge a PR with either of your proposed fixes. Spawn seems like the least risky option and performance isn't something I'm too worried about.

guicho271828 commented 3 years ago

Spawn does not affect vncdotool's performance. It is problematic because a potential user using vncdotool may have a large number of modules being loaded and is a heavy user of multiprocessing under production. Since fork has many benefits (e.g. not just the time savings from loading the modules, but also memory savings due to copy-on-write), it must be supported if it can be supported.

The correct fix is to fix the queue implementation.

sibson commented 3 years ago

To possibly clarify, I was considering memory usage a facet of performance. The use cases that vncdotool was designed for are not ones where I expect memory optimization to be a design constraint. It could be people are using it in ways that are memory constrained, I'm not aware of them and thus don't feel a need to maintain the existing memory footprint. Additionally, I favour minimizing dependencies to keep installation as easy as possible and I'm not familiar with alternative queue implementations, so that seems like a higher risk option, would need more testing and be harder to debug if there are issues with the library.

guicho271828 commented 3 years ago

Anyways, the strategy for a fix would be to create a new queue each time a connection is opened. I guess some part of the queue is created & shared in the factory class, which is the culprit.

sibson commented 3 years ago

ah that makes sense, thanks.