zeromq / libzmq

ZeroMQ core engine in C++, implements ZMTP/3.1
https://www.zeromq.org
Mozilla Public License 2.0

Problem: epoll crashes for some Windows users #4730

Closed: minrk closed this issue 3 months ago

minrk commented 3 months ago

Issue description

This was originally reported in pyzmq. After updating to libzmq 4.3.5, it appears the patch in #4422 did not fix the problem; it only shifted where the error occurs.

Environment

There seems to be some interaction with VPNs, firewalls, or something similar that is not yet fully understood, which makes this very hard to reproduce. For affected users it appears to happen most of the time, but I have never been able to see it myself.

Minimal test code / Steps to reproduce the issue

Python:

import zmq
ctx = zmq.Context()
with ctx:
    with ctx.socket(zmq.PUSH) as s:
        s.bind("tcp://127.0.0.1:5555")

What's the actual result? (include assertion message & call stack if applicable)

When libzmq is built with the CMake defaults (epoll and ipc enabled), this crashes with:

Bad file descriptor (C:\Users\runneradmin\AppData\Local\Temp\tmppy2n81h4\build_deps\bundled_libzmq-src\src\epoll.cpp:73)

epoll.cpp:73
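Because the failure aborts the whole process, it can help to run the reproduction in a child interpreter so that the exit code and stderr can be collected from affected users. The following is a minimal sketch, not part of pyzmq or libzmq: `run_isolated` and `REPRO` are hypothetical names introduced here, and the sketch assumes pyzmq is installed in the interpreter that runs it.

```python
import subprocess
import sys

# The reproduction from this issue, kept as a string so it can be
# executed in a separate interpreter (hypothetical helper constant).
REPRO = """
import zmq
ctx = zmq.Context()
with ctx:
    with ctx.socket(zmq.PUSH) as s:
        s.bind("tcp://127.0.0.1:5555")
"""


def run_isolated(code, timeout=30.0):
    """Run `code` in a child Python process and return (returncode, stderr).

    An abort inside libzmq would kill only the child, so the parent can
    still report what happened.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        errors="replace",  # paths in the message may not decode cleanly
        timeout=timeout,
    )
    return proc.returncode, proc.stderr
```

Calling `run_isolated(REPRO)` on an affected machine would be expected to return a non-zero exit code, with the `epoll.cpp` message in stderr if the abort is the one described here; on an unaffected machine it should return 0.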

What's the expected result?

It doesn't crash.

minrk commented 3 months ago

I managed to reproduce this on the same system, in the same environment, by adding a user with the username 日本語. So something in wepoll, libzmq, or Windows itself is doing something weird that is sensitive to the username and/or home directory.
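To check whether a reporter's setup matches this reproduction, one option is to look for non-ASCII characters in the home and temp directory paths. This is only a diagnostic sketch: `has_non_ascii` and `report_paths` are hypothetical helpers, and the assumption that these two paths are the relevant ones is mine, not something confirmed in this thread.

```python
import os
import tempfile


def has_non_ascii(path):
    """Return True if the path contains any non-ASCII character."""
    return not path.isascii()


def report_paths():
    """Map a label to (path, contains_non_ascii) for likely-relevant paths.

    Which paths actually matter is an assumption; the home and temp
    directories are checked because both usually embed the username
    on Windows.
    """
    candidates = {
        "home": os.path.expanduser("~"),
        "temp": tempfile.gettempdir(),
    }
    return {name: (path, has_non_ascii(path)) for name, path in candidates.items()}
```

Asking affected users to run `report_paths()` and share the result would show whether their paths resemble the 日本語 case.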

minrk commented 3 months ago

#4732 fixes one cause of this error, but not all. The error is indeed seen when _wmkdir fails, which #4732 fixes, but the more mysterious case seen in pyzmq appears to occur even after a successful bind of the ipc socket, which I can't explain.