libuv / help

Need help with libuv? Post your question here.
28 stars 7 forks source link

libuv 1.18.0 uv_poll_start() crashed with `uv__io_stop: Assertion `loop->watchers[w->fd] == w' failed.', while libuv 1.34.2 not. #158

Closed leiless closed 4 years ago

leiless commented 4 years ago

Hi, libuv maintainers, I recently wrote a demo libuv program that uses uv_pipe_t for the IPC echo test.

In the demo program

1. [Main thread] Setup an IPC server, waiting for IPC clients.
2. [N threads] Set up an IPC client which connects to the IPC server in the main thread
3. Once N threads connected to the main thread, main thread `uv_write()` messages to those IPC clients.
4. N threads received those messages and print out to `stderr'
5. That all

The code can be found at gist.

When I run above code in Ubuntu 18.04 LTS, libuv(1.18.0) crashed:

$ ./test_program
20/07/11 16:16:47.887 +0800 | 1.18.0
20/07/11 16:16:47.888 +0800 | Waiting IPC clients...
test_program: src/unix/core.c:896: uv__io_stop: Assertion `loop->watchers[w->fd] == w' failed.
20/07/11 16:16:47.889 +0800 | All IPC clients connected
test_program: src/unix/core.c:896: uv__io_stop: Assertion `loop->watchers[w->fd] == w' failed.
[1]    30094 abort (core dumped)  ./test_program

$ lldb ./test_program
(lldb) target create "./test_program"
Current executable set to './test_program' (x86_64).
(lldb) run
Process 30040 launched: './test_program' (x86_64)
20/07/11 16:11:51.781 +0800 | 1.18.0
20/07/11 16:11:51.782 +0800 | Waiting IPC clients...
test_program: src/unix/core.c:896: uv__io_stop: Assertion `loop->watchers[w->fd] == w' failed.
Process 30040 stopped
* thread #2, name = 'test_program', stop reason = signal SIGABRT
    frame #0: 0x00007ffff77fcf47 libc.so.6`__GI_raise(sig=2) at raise.c:51
(lldb) bt
* thread #2, name = 'test_program', stop reason = signal SIGABRT
  * frame #0: 0x00007ffff77fcf47 libc.so.6`__GI_raise(sig=2) at raise.c:51
    frame #1: 0x00007ffff77fe8b1 libc.so.6`__GI_abort at abort.c:79
    frame #2: 0x00007ffff77ee42a libc.so.6`__assert_fail_base(fmt="%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion="loop->watchers[w->fd] == w", file="src/unix/core.c", line=896, function=<unavailable>) at assert.c:92
    frame #3: 0x00007ffff77ee4a2 libc.so.6`__GI___assert_fail(assertion=<unavailable>, file=<unavailable>, line=<unavailable>, function=<unavailable>) at assert.c:101
    frame #4: 0x00007ffff7bbbae9 libuv.so.1`uv__io_stop + 345
    frame #5: 0x00007ffff7bc1e06 libuv.so.1`___lldb_unnamed_symbol39$$libuv.so.1 + 22
    frame #6: 0x00007ffff7bc1f83 libuv.so.1`uv_poll_start + 35
    frame #7: 0x000055555555969d test_program`uv_util_poll_init_start(h=0x00007ffff71919b0, loop=0x00007ffff7191b80, fd=16, events=5, cb=(test_program`ipc_read_cb at test.c:22)) at uv_utils.c:28
    frame #8: 0x000055555555c089 test_program`ipc_client_connect_cb(req=0x00007ffff7191950, status=0) at test.c:74
    frame #9: 0x00007ffff7bc52c4 libuv.so.1`___lldb_unnamed_symbol61$$libuv.so.1 + 308
    frame #10: 0x00007ffff7bca340 libuv.so.1`uv__io_poll + 1024
    frame #11: 0x00007ffff7bbacc8 libuv.so.1`uv_run + 280
    frame #12: 0x000055555555c318 test_program`thread_cb(arg=0x0000000000000000) at test.c:96
    frame #13: 0x00007ffff739e6db libpthread.so.0`start_thread + 219
    frame #14: 0x00007ffff78dfa3f libc.so.6`__GI___clone at clone.S:95
(lldb) register r
General Purpose Registers:
       rax = 0x0000000000000000
       rbx = 0x0000000000000000
       rcx = 0x00007ffff77fcf47  libc.so.6`__GI_raise + 199 at raise.c:51
       rdx = 0x0000000000000000
       rdi = 0x0000000000000002
       rsi = 0x00007ffff718e3a0
       rbp = 0x00007ffff7975a38  
       rsp = 0x00007ffff718e3a0
        r8 = 0x0000000000000000
        r9 = 0x00007ffff718e3a0
       r10 = 0x0000000000000008
       r11 = 0x0000000000000246
       r12 = 0x00007ffff7bcd26d  
       r13 = 0x00007ffff7bcd324  
       r14 = 0x0000000000000380
       r15 = 0x00007ffff718e860
       rip = 0x00007ffff77fcf47  libc.so.6`__GI_raise + 199 at raise.c:51
    rflags = 0x0000000000000246
        cs = 0x0000000000000033
        fs = 0x0000000000000000
        gs = 0x0000000000000000
        ss = 0x000000000000002b
        ds = 0x0000000000000000
        es = 0x0000000000000000

The core function uv_poll_start failed, which upcalled from uv_util_poll_init_start():

/**
 * @return  0 if no error, otherwise uv error code.
 */
int uv_util_poll_init_start(
        poll_ctx_t *h,
        uv_loop_t *loop,
        int fd,
        int events,
        uv_poll_cb cb)
{
    assert_nonnull(h);
    assert_nonnull(loop);
    assert_ge(fd, 0, "%d");
    assert_nonnull(cb);

    h->loop = loop;
    h->fd = fd;
    h->events = events;
    h->cb = cb;

    int e = uv_poll_init(h->loop, &h->handle, h->fd);
    if (e == 0) {
        e = uv_poll_start(&h->handle, h->events, h->cb);
    }
    return e;
}

--

And then I run the code under Ubuntu 20.04 LTS, libuv 1.34.2, everything works fine

./test_program
20/07/11 16:14:29.803 +0800 | 1.34.2
20/07/11 16:14:29.803 +0800 | Waiting IPC clients...
20/07/11 16:14:29.804 +0800 | All IPC clients connected
++++++++++++++++++++++++++++++++++++++++++++++++++++++++...

--

Is this phenomena indicates that it's a bug in old libuv implementation? Or I just simply misused the libuv APIs?

leiless commented 4 years ago

According to uv_pipe_connect(), it'll poll UV_HANDLE_READABLE | UV_HANDLE_WRITABLE for given uv_pipe_t* handle, in our case, it's ipc_client:


  err = 0;
  if (new_sock) {
    err = uv__stream_open((uv_stream_t*)handle,
                          uv__stream_fd(handle),
                          UV_HANDLE_READABLE | UV_HANDLE_WRITABLE);
  }

  if (err == 0)
    uv__io_start(handle->loop, &handle->io_watcher, POLLOUT);

Later in our ipc_client_connect_cb(), we shouldn't poll the same fd(backed by ipc_client) more than once?

https://github.com/libuv/libuv/pull/2686

Creating a second poll handle over the same fd is not supported. Currently we do give an error if the handle is running,...

https://github.com/libuv/libuv/issues/2387

Are you using 2 different poll handles for the same fd?