zeromq / pyzmq

PyZMQ: Python bindings for zeromq
http://zguide.zeromq.org/py:all
BSD 3-Clause "New" or "Revised" License

FEAT: support AnyIO #1827

Open davidbrochart opened 1 year ago

davidbrochart commented 1 year ago

What pyzmq version?

25.0.0

What libzmq version?

4.3.4

Python version (and how it was installed)

python 3.11.0, via conda-forge

OS

ubuntu 22.04

What happened?

I would like to use pyzmq with an async context inside an AnyIO event loop, using a trio backend. From what I can see, pyzmq only supports asyncio. How do you feel about supporting AnyIO?

Code to reproduce bug

No response

Traceback, if applicable

No response

More info

No response

minrk commented 1 year ago

I don't object to it, but I'm not likely to implement it any time soon. The abstraction in zmq._future should make an anyio implementation pretty small, I would guess.

davidbrochart commented 1 year ago

Good to know, would you be open to a PR?

minrk commented 1 year ago

Yup!

minrk commented 6 months ago

I looked at this for a bit, and it appears anyio has decided against supporting FDs, and FD support is required for zmq integration to be possible. So if zmq were to support anyio, it seems it would only work on asyncio, and require tornado's selector thread implementation.

tornado is the only event loop that appears to have bothered to support polling FDs on Windows, and pyzmq relies on tornado's implementation when working with asyncio on Windows.

It's possible that trio's workaround for pipes could be used for zmq FDs, but I don't know.

It might also be possible to port tornado's SelectorThread to anyio.

But the real answer is for these event loops to implement APIs to wait for an FD to be readable, as is possible in select, epoll, etc. It shouldn't be the responsibility of libraries to provide this basic functionality to the event loop.

davidbrochart commented 6 months ago

FWIW, I implemented something similar in Jupyverse (part of https://github.com/jupyter-server/jupyverse/pull/388).

minrk commented 6 months ago

Yeah, I think that approach would work. Just need to be extremely careful about races on the edge-triggered FD when working with threads, because accessing zmq.EVENTS can prevent the select from waking. Looks like what you have would mean one select call and thread per zmq socket, which is not very scalable. The tornado SelectorThread merges all readers into a single select call, which is nice for an application like Jupyter, which may work with quite a few zmq sockets.
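To illustrate the "one thread, one select call for many sockets" idea, here is a minimal stdlib-only sketch. The class name `SharedSelectorThread` and its methods are hypothetical, plain `socket.socketpair()` sockets would stand in for zmq FDs, and this is not tornado's actual SelectorThread, which hands readiness back to the event loop rather than running callbacks in the selector thread.

```python
import queue
import selectors
import socket
import threading


class SharedSelectorThread:
    """Hypothetical helper: one thread and one selector serving many readers."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()
        self._pending = queue.SimpleQueue()
        self._closed = False
        # Socket pair used only to wake the thread out of select().
        self._wake_r, self._wake_w = socket.socketpair()
        self._wake_r.setblocking(False)
        self._selector.register(self._wake_r, selectors.EVENT_READ, None)
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def add_reader(self, fileobj, callback):
        # Registration happens in the selector thread; we only queue and wake.
        self._pending.put((fileobj, callback))
        self._wake_w.send(b"x")

    def close(self):
        self._closed = True
        self._wake_w.send(b"x")
        self._thread.join()
        self._selector.close()
        self._wake_r.close()
        self._wake_w.close()

    def _run(self):
        while not self._closed:
            # A single select() call covers every registered reader.
            for key, _events in self._selector.select():
                if key.data is None:
                    self._wake_r.recv(4096)  # drain wake-up bytes
                else:
                    # Level-triggered: the callback must consume the data.
                    key.data(key.fileobj)
            while not self._pending.empty():
                fileobj, callback = self._pending.get()
                self._selector.register(fileobj, selectors.EVENT_READ, callback)
```

Registering a second or tenth reader adds no extra thread; it only adds one more entry to the same selector. An edge-triggered FD like zmq's would additionally need the care around zmq.EVENTS described above, which this sketch does not attempt.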

I guess anyio brings the bad ProactorEventLoop situation to all platforms since it has to be least-common-denominator, which is consistent, I guess, if kind of a worst-case scenario.

I've managed to test that trio's lowlevel.wait_readable works for the zmq FDs. So pyzmq could work with anyio on at least trio or asyncio (the only real options), by picking based on sniffio.current_async_library:

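import sniffio

library = sniffio.current_async_library()
if library == "asyncio":
    ...  # use the existing selector_thread implementation
elif library == "trio":
    ...  # use trio.lowlevel.wait_readable
else:
    raise RuntimeError(f"unsupported async library: {library}")

davidbrochart commented 6 months ago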

But I think you're right that it should be part of AnyIO, if Trio has this feature.

minrk commented 6 months ago

I did some tests, and I'm pretty sure this can work. Unfortunate that we can't use ~any anyio APIs to accomplish this, but I think we can make it compatible with trio, at least.

davidbrochart commented 6 months ago

Let's continue the discussion from https://github.com/ipython/ipykernel/pull/1079#issuecomment-1974020784 here, maybe @agronholm has ideas.

agronholm commented 6 months ago

@minrk Could you first help me understand why anyio.wait_socket_readable() is not good enough here?

minrk commented 6 months ago

I think maybe it can, I will have a look. I've never known what kind of socket it is, but I can probably figure it out. Thanks for the pointer.

Since socket.fromfd duplicates the FD, doing that in order to pass the duplicate FD to APIs where the existing one works is less than ideal when FD exhaustion is already an issue in applications like Jupyter. Especially when the underlying implementations support the integer FDs already.
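A small stdlib sketch of the duplication in question, using a `socket.socketpair()` socket as a stand-in for a zmq FD (the function name `fromfd_duplicates` is only for illustration):

```python
import socket


def fromfd_duplicates():
    """Return (original_fd, duplicate_fd) for a stand-in socket."""
    original, peer = socket.socketpair()
    try:
        # socket.fromfd() duplicates the descriptor, so the returned
        # socket object owns a second, distinct file descriptor.
        duplicate = socket.fromfd(
            original.fileno(), original.family, original.type
        )
        try:
            return original.fileno(), duplicate.fileno()
        finally:
            duplicate.close()
    finally:
        original.close()
        peer.close()
```

The two numbers returned always differ, so every wrapped FD costs one extra descriptor for as long as the wrapper socket stays open.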

from the docs:

This does NOT work on Windows when using the asyncio backend with a proactor event loop (default on py3.8+).

This would be a blocker for using it, though, since this is ~all Windows users. It would be wonderful if anyio ported tornado's SelectorThread to restore this functionality to asyncio for feature parity across implementations. CPython devs even agree it should be in asyncio itself, though nobody has the time to bring it in (copying trio's AFD_POLL would be even better, if harder). As it is now, since we have to detect asyncio and call lower-level APIs to make asyncio work, we might as well do the same for trio, which avoids the duplication of every FD.

agronholm commented 6 months ago

Ok, so the lack of proactor loop support for this is the crux of the problem. I'm open to allowing a hack like SelectorThread to be added to AnyIO for this, so long as I can agree with the implementation.

agronholm commented 6 months ago

That said, do you know why pyzmq needs this functionality? IIRC, AnyIO only uses it internally to implement UNIX domain sockets support.

minrk commented 6 months ago

Yes, very much so. I don't think I have the necessary expertise to port the selector thread into something that fits appropriately in the structured concurrency pattern, since where it is now is scoped only to the whole event loop, which I'm guessing is not the right fit, but I'm not sure.

agronholm commented 6 months ago

A potential solution would be to keep that selector thread up so long as there are any running calls to wait_socket_(readable|writable).
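As a rough sketch of that lifetime rule, assuming a reference count is used to decide when the thread starts and stops: the class `RefCountedThread` and its `in_use()` method are hypothetical, and the thread here merely waits on an event where a real one would run the select loop.

```python
import threading
from contextlib import contextmanager


class RefCountedThread:
    """Hypothetical: a background thread that lives only while it has users."""

    def __init__(self):
        self._lock = threading.Lock()
        self._users = 0
        self._stop = None
        self._thread = None

    @property
    def running(self):
        return self._thread is not None and self._thread.is_alive()

    @contextmanager
    def in_use(self):
        with self._lock:
            if self._users == 0:
                # First waiter: start the thread (stand-in for a select loop).
                self._stop = threading.Event()
                self._thread = threading.Thread(
                    target=self._stop.wait, daemon=True
                )
                self._thread.start()
            self._users += 1
        try:
            yield
        finally:
            with self._lock:
                self._users -= 1
                if self._users == 0:
                    # Last waiter gone: stop and join the thread.
                    self._stop.set()
                    self._thread.join()
                    self._thread = None
```

Each wait call would wrap itself in `with worker.in_use():`, so the thread never outlives the calls that need it, at the cost of restarting it whenever the count drops to zero.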

minrk commented 6 months ago

Yeah, that's one option. I think tornado's solution of the selector existing for the lifetime of the asyncio loop makes the most sense, but I can see how structured concurrency might not like that. I think it is the best approach, though, given that this is really a missing feature in one implementation of the asyncio.EventLoop class, so fixing that seems to make the most sense at the EventLoop instance level, which would be consistent with all other asyncio.EventLoop implementations.