Open rmorshea opened 3 months ago
I don't think PUBHandler should accept async sockets, so the fix is probably to cast async sockets to sync sockets if they are given. The workaround is to do it yourself before passing to PUBHandler, I think.
Why shouldn't the PUBHandler accept an async socket? It seems like it would be valuable to avoid blocking calls to the socket whenever you hit log statements.
Why shouldn't the PUBHandler accept an async socket?
For one, PUB sockets effectively never block anyway. If they are backed up, they drop messages instead of blocking. Plus, in general, zmq sockets (at the libzmq level) are not threadsafe, so I think it is not actually safe to use PUBHandler with any logger that may be used concurrently from multiple threads with any kind of socket. In that way, I suppose using call_soon_threadsafe
might actually be safer than a sync socket. But then you'd need to be sure you are connecting to the right event loop, and I don't think PUBHandler has enough information to do that (e.g. consecutive calls to asyncio.run
which are in one thread, but two event loops).
Cases to consider:
__init__
called when no event loop is running, but emit
called when it is (common in application setup for log handlers)asyncio.run
, each of which create and destroy an event loopA separate AsyncPUBHandler that has declared semantics (i.e. attaches to the running loop at start and requires that the loop is running) could make sense, though, but I think it would cause problems in the first two cases to adopt that behavior in the base PUBHandler.
I think a separate AsyncPUBHandler
makes a lot of sense.
One way to address the first concern might be to copy asyncio.mixins._LoopBoundMixin
which implements the following in order to lazily determine the loop to use:
class _LoopBoundMixin:
_loop = None
def _get_loop(self):
loop = events._get_running_loop()
if self._loop is None:
with _global_lock:
if self._loop is None:
self._loop = loop
if loop is not self._loop:
raise RuntimeError(f'{self!r} is bound to a different event loop')
return loop
However, I could see this being a bit challenging to reason about since it wouldn't be immediately obvious what loop the handler would be bound to. The only way to find out would be to read the code to determine what loop is active when the first log is produced.
As a compromise, perhaps a loop
argument to AsyncPUBHandler
could accept:
AbstractEventLoop
- bind to the explicitly given loopNone
- bind to the currently running loop (probably the default)"lazy"
- bind to the currently running loop when the first log is handledThat sounds reasonable. Want to have a go? I'm hoping to release pyzmq 26 soon, and imagine 26.1 will likely come pretty soon after with fixes for issues that get reported with the new build system.
Great. Will try and take a crack at it when I find time.
This is a pyzmq bug
What pyzmq version?
22.3.0
What libzmq version?
4.3.4
Python version (and how it was installed)
Python 3.10 via conda-forge
OS
macOS 14
What happened?
If you attempt to send a message from a different thread that doesn't have an event loop running when using the
zqm.asyncio.Context
you'll get an error complain about missing event loop becauseSocket._Future()
requests access to the currently running event loop by default.In most cases this would be user error, but in the context of logging, it's quite hard to avoid. For example, when interacting with sync library code it's quite common to use
asyncio.to_thread
to prevent blocking calls. If the library you're calling in a thread then logs, you'll get an error because the thread spawned to do the work doesn't have an event loop.This has came up in several different ways for me. First when using
ddtrace
(a Datadog client library) and then later in my own library code.Code to reproduce bug
Traceback, if applicable
More info
I'm currently using the following as a workaround: