apple / swift-nio

Event-driven network application framework for high performance protocol servers & clients, non-blocking.
https://swiftpackageindex.com/apple/swift-nio/documentation
Apache License 2.0
7.98k stars 650 forks source link

Too many calls to futex #127

Closed toffaletti closed 2 years ago

toffaletti commented 6 years ago

Expected behavior

I expect there not to be so many futex calls.

Actual behavior

916   20:12:30 epoll_create(128)        = 3 <0.000055>                                                                                                                         
916   20:12:30 futex(0x667b10, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000023>
916   20:12:30 futex(0x667b20, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000033>
916   20:12:30 eventfd2(0, EFD_CLOEXEC|EFD_NONBLOCK) = 4 <0.000064>
916   20:12:30 futex(0x667af0, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000032>
916   20:12:30 futex(0x667b00, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000022>
916   20:12:30 timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC|TFD_NONBLOCK) = 5 <0.000058>
916   20:12:30 futex(0x667b60, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000022>
916   20:12:30 futex(0x667b80, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000033>
916   20:12:30 futex(0x667b90, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000032>
916   20:12:30 futex(0x667b30, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000023>
916   20:12:30 epoll_ctl(3, EPOLL_CTL_ADD, 4, {EPOLLIN|EPOLLERR|EPOLLRDHUP, {u32=4, u64=4}}) = 0 <0.000068>
916   20:12:30 futex(0x667ba0, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000020>
916   20:12:30 epoll_ctl(3, EPOLL_CTL_ADD, 5, {EPOLLIN|EPOLLERR|EPOLLRDHUP|EPOLLET, {u32=5, u64=5}}) = 0 <0.000053>
916   20:12:30 futex(0x6686c8, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000068>
916   20:12:30 futex(0x6680b0, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000056>
916   20:12:30 futex(0x7f81323e9530, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000064>
916   20:12:30 futex(0x7f81323e9928, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000068>

Steps to reproduce

  1. docker run --security-opt seccomp:unconfined -v$CWD:/code -it --rm swiftnio
  2. cd /code && swift build
  3. strace -s 1024 -fF -tT -o trace.out ./.build/x86_64-unknown-linux/debug/NIOHTTP1Server 0.0.0.0 8888 .

If possible, minimal yet complete reproducer code (or URL to code)

There are no futex calls here:

root@1ba3a0b6d2c0:/code# cat epoll.c 
#include <sys/epoll.h>

int main() {
    int fd = epoll_create(1);
    epoll_wait(fd, 0, 0, 0);
    return 0;
}
root@1ba3a0b6d2c0:/code# clang epoll.c 
root@1ba3a0b6d2c0:/code# strace -s 1024 -f -tT ./a.out 
21:46:15 execve("./a.out", ["./a.out"], [/* 11 vars */]) = 0 <0.001591>
21:46:15 brk(NULL)                      = 0x1c96000 <0.000028>
21:46:15 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.000200>
21:46:15 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) <0.000012>
21:46:15 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 <0.000253>
21:46:15 fstat(3, {st_mode=S_IFREG|0644, st_size=18870, ...}) = 0 <0.000017>
21:46:15 mmap(NULL, 18870, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f74ce3e6000 <0.000023>
21:46:15 close(3)                       = 0 <0.000015>
21:46:15 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.000036>
21:46:15 open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 <0.000022>
21:46:15 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\t\2\0\0\0\0\0@\0\0\0\0\0\0\0\270r\34\0\0\0\0\0\0\0\0\0@\0008\0\n\0@\0H\0G\0\6\0\0\0\5\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0000\2\0\0\0\0\0\0000\2\0\0\0\0\0\0\10\0\0\0\0\0\0\0\3\0\0\0\4\0\0\0`f\31\0\0\0\0\0`f\31\0\0\0\0\0`f\31\0\0\0\0\0\34\0\0\0\0\0\0\0\34\0\0\0\0\0\0\0\20\0\0\0\0\0\0\0\1\0\0\0\5\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\20\373\33\0\0\0\0\0\20\373\33\0\0\0\0\0\0\0 \0\0\0\0\0\1\0\0\0\6\0\0\0\300\7\34\0\0\0\0\0\300\7<\0\0\0\0\0\300\7<\0\0\0\0\0`O\0\0\0\0\0\0\340\221\0\0\0\0\0\0\0\0 \0\0\0\0\0\2\0\0\0\6\0\0\0\240;\34\0\0\0\0\0\240;<\0\0\0\0\0\240;<\0\0\0\0\0\340\1\0\0\0\0\0\0\340\1\0\0\0\0\0\0\10\0\0\0\0\0\0\0\4\0\0\0\4\0\0\0p\2\0\0\0\0\0\0p\2\0\0\0\0\0\0p\2\0\0\0\0\0\0D\0\0\0\0\0\0\0D\0\0\0\0\0\0\0\4\0\0\0\0\0\0\0\7\0\0\0\4\0\0\0\300\7\34\0\0\0\0\0\300\7<\0\0\0\0\0\300\7<\0\0\0\0\0\20\0\0\0\0\0\0\0x\0\0\0\0\0\0\0\10\0\0\0\0\0\0\0P\345td\4\0\0\0|f\31\0\0\0\0\0|f\31\0\0\0\0\0|f\31\0\0\0\0\0\274T\0\0\0\0\0\0\274T\0\0\0\0\0\0\4\0\0\0\0\0\0\0Q\345td\6\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\20\0\0\0\0\0\0\0R\345td\4\0\0\0\300\7\34\0\0\0\0\0\300\7<\0\0\0\0\0\300\7<\0\0\0\0\0@8\0\0\0\0\0\0@8\0\0\0\0\0\0\1\0\0\0\0\0\0\0\4\0\0\0\24\0\0\0\3\0\0\0GNU\0\2658\32Ey\6\322y\0078\"\245\316\262LK\376\371M\333\4\0\0\0\20\0\0\0\1\0\0\0GNU\0\0\0\0\0\2\0\0\0\6\0\0\0 \0\0\0\0\0\0\0\363\3\0\0\n\0\0\0\0\1\0\0\16\0\0\0\0000\20D\240 \2\1\210\3\346\220\305E\214\0\304\0\10\0\5\204\0`\300\200\0\r\212\f\0\4\20\0\210@2\10*@\210T<, \0162H&\204\300\214\4\10\0\2\2\16\241\254\32\4f\300\0\3002\0\300\0P\1 \201\10\204\v  ($\0\4 Z\0\20X\200\312DB(\0\6\200\20\30B\0 @\200\0IP\0Q\212@\22\0\0\0\0\10\0\0\21\20", 832) = 832 <0.000024>
21:46:15 fstat(3, {st_mode=S_IFREG|0755, st_size=1868984, ...}) = 0 <0.000013>
21:46:15 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f74ce3e5000 <0.000111>
21:46:15 mmap(NULL, 3971488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f74cddfc000 <0.000057>
21:46:15 mprotect(0x7f74cdfbc000, 2097152, PROT_NONE) = 0 <0.000041>
21:46:15 mmap(0x7f74ce1bc000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c0000) = 0x7f74ce1bc000 <0.000044>
21:46:15 mmap(0x7f74ce1c2000, 14752, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f74ce1c2000 <0.000019>
21:46:15 close(3)                       = 0 <0.000013>
21:46:15 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f74ce3e4000 <0.000020>
21:46:15 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f74ce3e3000 <0.000011>
21:46:15 arch_prctl(ARCH_SET_FS, 0x7f74ce3e4700) = 0 <0.000159>
21:46:15 mprotect(0x7f74ce1bc000, 16384, PROT_READ) = 0 <0.000030>
21:46:15 mprotect(0x600000, 4096, PROT_READ) = 0 <0.000068>
21:46:15 mprotect(0x7f74ce3eb000, 4096, PROT_READ) = 0 <0.000024>
21:46:15 munmap(0x7f74ce3e6000, 18870)  = 0 <0.000019>
21:46:15 epoll_create(1)                = 3 <0.000044>
21:46:15 epoll_wait(3, NULL, 0, 0)      = -1 EINVAL (Invalid argument) <0.000015>
21:46:15 exit_group(0)                  = ?
21:46:15 +++ exited with 0 +++

SwiftNIO version/commit hash

c8d198020717b8bbb4c3be4e47c06aab4085c7fc

Swift & OS version (output of swift --version && uname -a)

Swift version 4.0 (swift-4.0-RELEASE) Target: x86_64-unknown-linux-gnu Linux 1ba3a0b6d2c0 4.9.60-linuxkit-aufs #1 SMP Mon Nov 6 16:00:12 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

toffaletti commented 6 years ago

I introduced pthreads just to be sure:

root@1ba3a0b6d2c0:/code# cat epoll.c 
#include <sys/epoll.h>
#include <pthread.h>

static void *start(void *arg) {
    for (int i=0; i<10; i++) {
        int fd = epoll_create(1);
        epoll_wait(fd, 0, 0, 0);
    }
    return 0;
}

int main() {
    pthread_t thread;
    pthread_create(&thread, 0, start, 0);
    pthread_join(thread, 0);
    return 0;
}
root@1ba3a0b6d2c0:/code# clang -pthread epoll.c 
root@1ba3a0b6d2c0:/code# strace -s 1024 -f -tT ./a.out
22:03:51 execve("./a.out", ["./a.out"], [/* 11 vars */]) = 0 <0.002773>
22:03:51 brk(NULL)                      = 0xb81000 <0.000168>
22:03:51 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.000044>
22:03:51 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) <0.000124>
22:03:51 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 <0.000055>
22:03:51 fstat(3, {st_mode=S_IFREG|0644, st_size=18870, ...}) = 0 <0.000230>
22:03:51 mmap(NULL, 18870, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fce60f39000 <0.000470>
22:03:51 close(3)                       = 0 <0.000312>
22:03:51 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.000311>
22:03:51 open("/lib/x86_64-linux-gnu/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3 <0.000384>
22:03:51 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260`\0\0\0\0\0\0@\0\0\0\0\0\0\0\210\23\2\0\0\0\0\0\0\0\0\0@\0008\0\t\0@\0)\0&\0\6\0\0\0\5\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0\370\1\0\0\0\0\0\0\370\1\0\0\0\0\0\0\10\0\0\0\0\0\0\0\3\0\0\0\4\0\0\0\2206\1\0\0\0\0\0\2206\1\0\0\0\0\0\2206\1\0\0\0\0\0\34\0\0\0\0\0\0\0\34\0\0\0\0\0\0\0\20\0\0\0\0\0\0\0\1\0\0\0\5\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\204y\1\0\0\0\0\0\204y\1\0\0\0\0\0\0\0 \0\0\0\0\0\1\0\0\0\6\0\0\0x{\1\0\0\0\0\0x{!\0\0\0\0\0x{!\0\0\0\0\0(\7\0\0\0\0\0\0\260H\0\0\0\0\0\0\0\0 \0\0\0\0\0\2\0\0\0\6\0\0\0P}\1\0\0\0\0\0P}!\0\0\0\0\0P}!\0\0\0\0\0000\2\0\0\0\0\0\0000\2\0\0\0\0\0\0\10\0\0\0\0\0\0\0\4\0\0\0\4\0\0\0008\2\0\0\0\0\0\0008\2\0\0\0\0\0\0008\2\0\0\0\0\0\0D\0\0\0\0\0\0\0D\0\0\0\0\0\0\0\4\0\0\0\0\0\0\0P\345td\4\0\0\0\2606\1\0\0\0\0\0\2606\1\0\0\0\0\0\2606\1\0\0\0\0\0\214\10\0\0\0\0\0\0\214\10\0\0\0\0\0\0\4\0\0\0\0\0\0\0Q\345td\6\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\20\0\0\0\0\0\0\0R\345td\4\0\0\0x{\1\0\0\0\0\0x{!\0\0\0\0\0x{!\0\0\0\0\0\210\4\0\0\0\0\0\0\210\4\0\0\0\0\0\0\1\0\0\0\0\0\0\0\4\0\0\0\24\0\0\0\3\0\0\0GNU\0\316\27\340#T\"e\374\21\331\274\217SK\264\360pI=0\4\0\0\0\20\0\0\0\1\0\0\0GNU\0\0\0\0\0\2\0\0\0\6\0\0\0 \0\0\0\0\0\0\0\345\1\0\0W\0\0\0 \0\0\0\v\0\0\0\31#\2\261\1\10\20\2@@a\370\3\10\10\25\200 \0\0\0\0\200\300\321Q\0\0\0\22\353\3020D\0\10\20A\0\2\0\2\f\1\200\v\221\1\330\240\r\240@\230 \244\200\21\n\202-l@g\214V\24\0\224 \200$H\200P(\1\22\f\311B\240\220\22\10\f \2ZdA\245c\4@\n\n\n\0\2009\1(\314D\204\201\300\22\10(\fD\0\0\0\200Q\10\200\35\4B\320\2608A\0\1\0\0\265\0300\0\200`\2\20\"\0\tA\20\1\5\0P(\251\22G(\0\0\202\4\230@\4\0\20\340T\0\2@\2\2\20\3010f\26\200\0", 832) = 832 <0.000511>
22:03:51 fstat(3, {st_mode=S_IFREG|0755, st_size=138696, ...}) = 0 <0.000057>
22:03:51 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fce60f38000 <0.000038>
22:03:51 mmap(NULL, 2212904, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fce60afc000 <0.000175>
22:03:51 mprotect(0x7fce60b14000, 2093056, PROT_NONE) = 0 <0.000482>
22:03:51 mmap(0x7fce60d13000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x17000) = 0x7fce60d13000 <0.000815>
22:03:51 mmap(0x7fce60d15000, 13352, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fce60d15000 <0.000049>
22:03:51 close(3)                       = 0 <0.000236>
22:03:51 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.000034>
22:03:51 open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 <0.000046>
22:03:51 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\t\2\0\0\0\0\0@\0\0\0\0\0\0\0\270r\34\0\0\0\0\0\0\0\0\0@\0008\0\n\0@\0H\0G\0\6\0\0\0\5\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0000\2\0\0\0\0\0\0000\2\0\0\0\0\0\0\10\0\0\0\0\0\0\0\3\0\0\0\4\0\0\0`f\31\0\0\0\0\0`f\31\0\0\0\0\0`f\31\0\0\0\0\0\34\0\0\0\0\0\0\0\34\0\0\0\0\0\0\0\20\0\0\0\0\0\0\0\1\0\0\0\5\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\20\373\33\0\0\0\0\0\20\373\33\0\0\0\0\0\0\0 \0\0\0\0\0\1\0\0\0\6\0\0\0\300\7\34\0\0\0\0\0\300\7<\0\0\0\0\0\300\7<\0\0\0\0\0`O\0\0\0\0\0\0\340\221\0\0\0\0\0\0\0\0 \0\0\0\0\0\2\0\0\0\6\0\0\0\240;\34\0\0\0\0\0\240;<\0\0\0\0\0\240;<\0\0\0\0\0\340\1\0\0\0\0\0\0\340\1\0\0\0\0\0\0\10\0\0\0\0\0\0\0\4\0\0\0\4\0\0\0p\2\0\0\0\0\0\0p\2\0\0\0\0\0\0p\2\0\0\0\0\0\0D\0\0\0\0\0\0\0D\0\0\0\0\0\0\0\4\0\0\0\0\0\0\0\7\0\0\0\4\0\0\0\300\7\34\0\0\0\0\0\300\7<\0\0\0\0\0\300\7<\0\0\0\0\0\20\0\0\0\0\0\0\0x\0\0\0\0\0\0\0\10\0\0\0\0\0\0\0P\345td\4\0\0\0|f\31\0\0\0\0\0|f\31\0\0\0\0\0|f\31\0\0\0\0\0\274T\0\0\0\0\0\0\274T\0\0\0\0\0\0\4\0\0\0\0\0\0\0Q\345td\6\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\20\0\0\0\0\0\0\0R\345td\4\0\0\0\300\7\34\0\0\0\0\0\300\7<\0\0\0\0\0\300\7<\0\0\0\0\0@8\0\0\0\0\0\0@8\0\0\0\0\0\0\1\0\0\0\0\0\0\0\4\0\0\0\24\0\0\0\3\0\0\0GNU\0\2658\32Ey\6\322y\0078\"\245\316\262LK\376\371M\333\4\0\0\0\20\0\0\0\1\0\0\0GNU\0\0\0\0\0\2\0\0\0\6\0\0\0 \0\0\0\0\0\0\0\363\3\0\0\n\0\0\0\0\1\0\0\16\0\0\0\0000\20D\240 \2\1\210\3\346\220\305E\214\0\304\0\10\0\5\204\0`\300\200\0\r\212\f\0\4\20\0\210@2\10*@\210T<, \0162H&\204\300\214\4\10\0\2\2\16\241\254\32\4f\300\0\3002\0\300\0P\1 \201\10\204\v  ($\0\4 Z\0\20X\200\312DB(\0\6\200\20\30B\0 @\200\0IP\0Q\212@\22\0\0\0\0\10\0\0\21\20", 832) = 832 <0.000021>
22:03:51 fstat(3, {st_mode=S_IFREG|0755, st_size=1868984, ...}) = 0 <0.000020>
22:03:51 mmap(NULL, 3971488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fce60732000 <0.000143>
22:03:51 mprotect(0x7fce608f2000, 2097152, PROT_NONE) = 0 <0.000027>
22:03:51 mmap(0x7fce60af2000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c0000) = 0x7fce60af2000 <0.000054>
22:03:51 mmap(0x7fce60af8000, 14752, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fce60af8000 <0.000022>
22:03:51 close(3)                       = 0 <0.000026>
22:03:51 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fce60f37000 <0.000024>
22:03:51 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fce60f36000 <0.000022>
22:03:51 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fce60f35000 <0.000101>
22:03:51 arch_prctl(ARCH_SET_FS, 0x7fce60f36700) = 0 <0.000019>
22:03:51 mprotect(0x7fce60af2000, 16384, PROT_READ) = 0 <0.000046>
22:03:51 mprotect(0x7fce60d13000, 4096, PROT_READ) = 0 <0.000022>
22:03:51 mprotect(0x600000, 4096, PROT_READ) = 0 <0.000033>
22:03:51 mprotect(0x7fce60f3e000, 4096, PROT_READ) = 0 <0.000190>
22:03:51 munmap(0x7fce60f39000, 18870)  = 0 <0.000103>
22:03:51 set_tid_address(0x7fce60f369d0) = 1750 <0.000011>
22:03:51 set_robust_list(0x7fce60f369e0, 24) = 0 <0.000014>
22:03:51 rt_sigaction(SIGRTMIN, {0x7fce60b01b50, [], SA_RESTORER|SA_SIGINFO, 0x7fce60b0d390}, NULL, 8) = 0 <0.000012>
22:03:51 rt_sigaction(SIGRT_1, {0x7fce60b01be0, [], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x7fce60b0d390}, NULL, 8) = 0 <0.000010>
22:03:51 rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0 <0.000083>
22:03:51 getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0 <0.000023>
22:03:51 mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7fce5ff31000 <0.000014>
22:03:51 brk(NULL)                      = 0xb81000 <0.000011>
22:03:51 brk(0xba2000)                  = 0xba2000 <0.000024>
22:03:51 mprotect(0x7fce5ff31000, 4096, PROT_NONE) = 0 <0.000091>
22:03:51 clone(child_stack=0x7fce60730ff0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fce607319d0, tls=0x7fce60731700, child_tidptr=0x7fce607319d0) = 1751 <0.000202>
22:03:51 futex(0x7fce607319d0, FUTEX_WAIT, 1751, NULLstrace: Process 1751 attached
 <unfinished ...>
[pid  1751] 22:03:51 set_robust_list(0x7fce607319e0, 24) = 0 <0.000111>
[pid  1751] 22:03:51 epoll_create(1)    = 3 <0.000179>
[pid  1751] 22:03:51 epoll_wait(3, NULL, 0, 0) = -1 EINVAL (Invalid argument) <0.000162>
[pid  1751] 22:03:51 epoll_create(1)    = 4 <0.000238>
[pid  1751] 22:03:51 epoll_wait(4, NULL, 0, 0) = -1 EINVAL (Invalid argument) <0.000166>
[pid  1751] 22:03:51 epoll_create(1)    = 5 <0.000231>
[pid  1751] 22:03:51 epoll_wait(5, NULL, 0, 0) = -1 EINVAL (Invalid argument) <0.000163>
[pid  1751] 22:03:51 epoll_create(1)    = 6 <0.000625>
[pid  1751] 22:03:51 epoll_wait(6, NULL, 0, 0) = -1 EINVAL (Invalid argument) <0.000202>
[pid  1751] 22:03:51 epoll_create(1)    = 7 <0.000158>
[pid  1751] 22:03:51 epoll_wait(7, NULL, 0, 0) = -1 EINVAL (Invalid argument) <0.000375>
[pid  1751] 22:03:51 epoll_create(1)    = 8 <0.000150>
[pid  1751] 22:03:51 epoll_wait(8, NULL, 0, 0) = -1 EINVAL (Invalid argument) <0.000187>
[pid  1751] 22:03:51 epoll_create(1)    = 9 <0.000143>
[pid  1751] 22:03:51 epoll_wait(9, NULL, 0, 0) = -1 EINVAL (Invalid argument) <0.000129>
[pid  1751] 22:03:51 epoll_create(1)    = 10 <0.000210>
[pid  1751] 22:03:51 epoll_wait(10, NULL, 0, 0) = -1 EINVAL (Invalid argument) <0.000217>
[pid  1751] 22:03:51 epoll_create(1)    = 11 <0.000163>
[pid  1751] 22:03:51 epoll_wait(11, NULL, 0, 0) = -1 EINVAL (Invalid argument) <0.000197>
[pid  1751] 22:03:51 epoll_create(1)    = 12 <0.000102>
[pid  1751] 22:03:51 epoll_wait(12, NULL, 0, 0) = -1 EINVAL (Invalid argument) <0.000169>
[pid  1751] 22:03:51 madvise(0x7fce5ff31000, 8368128, MADV_DONTNEED) = 0 <0.000105>
[pid  1751] 22:03:51 exit(0)            = ?
[pid  1750] 22:03:51 <... futex resumed> ) = 0 <0.015823>
[pid  1750] 22:03:51 exit_group(0)      = ?
[pid  1751] 22:03:51 +++ exited with 0 +++
22:03:51 +++ exited with 0 +++

The futex here is the pthread_join.

toffaletti commented 6 years ago

I have a minimal reproducer in a branch https://github.com/toffaletti/swift-nio/blob/wtfutex/Sources/WTFutex/main.swift

toffaletti commented 6 years ago

Got it:

root@1ba3a0b6d2c0:/code# gdb ./.build/x86_64-unknown-linux/debug/WTFutex
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./.build/x86_64-unknown-linux/debug/WTFutex...done.
(gdb) catch syscall 202
...
Catchpoint 1 (returned from syscall futex), 0x00007ffff7414ac0 in futex_wake (private=0, processes_to_wake=2147483647, 
    futex_word=0x5da878 <globalinit_33_355B02204025CF64BE88BEACDD727F65_token2>) at ../sysdeps/unix/sysv/linux/futex-internal.h:231
231 in ../sysdeps/unix/sysv/linux/futex-internal.h
(gdb) bt
#0  0x00007ffff7414ac0 in futex_wake (private=0, processes_to_wake=2147483647, futex_word=0x5da878 <globalinit_33_355B02204025CF64BE88BEACDD727F65_token2>)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:231
#1  __pthread_once_slow (once_control=0x5da878 <globalinit_33_355B02204025CF64BE88BEACDD727F65_token2>, init_routine=0x7ffff6d34ac0 <__once_proxy>) at pthread_once.c:127
#2  0x00007ffff7a4397f in swift_once () from /usr/lib/swift/linux/libswiftCore.so
#3  0x0000000000558727 in _T07WTFutex7EventFdO11EFD_CLOEXECSifau () at /code/Sources/WTFutex/main.swift:66
#4  0x0000000000557464 in main () at /code/Sources/WTFutex/main.swift:126

WTFutex/main.swift:66

internal enum EventFd {
    public static let EFD_CLOEXEC = CNIOLinux.EFD_CLOEXEC <--

Every static let is causing a futex from pthread_once to support once semantics. This seems non-optimal for direct assignments.

toffaletti commented 6 years ago

@jckarter @gparker42 Anything to be done about this? SwiftNIO is using static let all over the place wanting something like constexpr but incurring cost of swift::swift_once on every access.

https://github.com/apple/swift/blob/3d178be169dd42b85f505ee1502232610535e4a3/stdlib/public/runtime/Once.cpp#L49

toffaletti commented 6 years ago

swift build -Xswiftc '-whole-module-optimization' -Xswiftc '-O'

Optimization seems to eliminate all futex calls in the reduced repro, but I still see a lot of futex calls in NIOHTTP1Server responding to a request:

1400  05:03:56 epoll_ctl(6, EPOLL_CTL_ADD, 10, {EPOLLIN|EPOLLERR|EPOLLRDHUP, {u32=10, u64=10}}) = 0 <0.000016>
1400  05:03:56 epoll_wait(6, [{EPOLLIN, {u32=10, u64=10}}], 64, -1) = 1 <0.000013>                                                                                             
1400  05:03:56 futex(0x55f11a5fca60, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000013>
1400  05:03:56 read(10, "GET / HTTP/1.1\r\nHost: localhost:8888\r\nUser-Agent: curl/7.47.0\r\nAccept: */*\r\n\r\n", 1024) = 78 <0.000017>
1400  05:03:56 futex(0x7f6818b387f8, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000012>
1400  05:03:56 mprotect(0x7f6808021000, 24576, PROT_READ|PROT_WRITE) = 0 <0.000016>
1400  05:03:56 futex(0x7f6818b387e8, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000012>
1400  05:03:56 futex(0x7f6818b05a10, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000012>
1400  05:03:56 futex(0x55f11a5fc978, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000012>
1400  05:03:56 futex(0x55f11a5fca58, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000011>
1400  05:03:56 writev(10, [{"HTTP/1.1 200 OK\r\ncontent-length: 12\r\n\r\n", 39}, {"Hello World!", 12}], 2) = 51 <0.000069>
1400  05:03:56 epoll_wait(6, [{EPOLLIN|EPOLLRDHUP, {u32=10, u64=10}}], 64, -1) = 1 <0.000017>
1400  05:03:56 read(10, "", 1024)       = 0 <0.000014>
1400  05:03:56 futex(0x55f11a5fc9d0, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000012>
1400  05:03:56 epoll_ctl(6, EPOLL_CTL_DEL, 10, 0x7f68149138f0) = 0 <0.000014>
1400  05:03:56 futex(0x55f11a5fc9f0, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000012>
1400  05:03:56 close(10)                = 0 <0.000070>
1400  05:03:56 futex(0x55f11a5f12a0, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000012>
1400  05:03:56 clock_gettime(CLOCK_MONOTONIC, {318156, 179407969}) = 0 <0.000015>
toffaletti commented 6 years ago

Looks like swift_getGenericMetadata is causing a futex instantiating public final class HTTPRequestDecoder: HTTPDecoder<HTTPServerRequestPart>:

Thread 2 "NIO-ELT-#0" hit Catchpoint 4 (call to syscall futex), 0x00007ffff7bc8ac0 in futex_wake (private=0, processes_to_wake=2147483647, 
    futex_word=0x5555556f83c8 <_T08NIOHTTP18HTTPPartOMP+40>) at ../sysdeps/unix/sysv/linux/futex-internal.h:231
231 ../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
(gdb) bt
#0  0x00007ffff7bc8ac0 in futex_wake (private=0, processes_to_wake=2147483647, futex_word=0x5555556f83c8 <_T08NIOHTTP18HTTPPartOMP+40>)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:231
#1  __pthread_once_slow (once_control=0x5555556f83c8 <_T08NIOHTTP18HTTPPartOMP+40>, init_routine=0x7ffff6dfaac0 <__once_proxy>) at pthread_once.c:127
#2  0x00007ffff7680b4a in swift_getGenericMetadata () from /usr/lib/swift/linux/libswiftCore.so
#3  0x0000555555656a28 in _T08NIOHTTP18HTTPPartOMa ()
#4  0x000055555563b091 in ?? ()
#5  0x00007ffff7bc8a99 in __pthread_once_slow (once_control=0x555555702c18, init_routine=0x7ffff6dfaac0 <__once_proxy>) at pthread_once.c:116
#6  0x00007ffff768916f in swift_once () from /usr/lib/swift/linux/libswiftCore.so
#7  0x00005555556367e9 in _T08NIOHTTP118HTTPRequestDecoderCACycfC () at /code/Sources/NIOHTTP1/HTTPDecoder.swift:121
#8  _T03NIO15ChannelPipelineC8NIOHTTP1E21addHTTPServerHandlersAA15EventLoopFutureCyytGSb5first_tF (first=..., self=...) at /code/Sources/NIOHTTP1/HTTPDecoder.swift:121
#9  0x0000555555663214 in _T014NIOHTTP1Server3NIO15EventLoopFutureCyytGAB7Channel_pcfU1_ () at /code/Sources/NIOHTTP1Server/main.swift:401
#10 0x0000555555599c5e in _T03NIO7Channel_pAA15EventLoopFutureCyytGIegxo_AaB_pAEIegir_TRTA () at /code/Sources/NIO/EventLoop.swift:45
#11 0x000055555559b889 in _T03NIO7Channel_pAA15EventLoopFutureCyytGIegxo_AaB_pAEIegir_TRTA.87 ()
#12 0x0000555555599c94 in _T03NIO7Channel_pAA15EventLoopFutureCyytGIegir_AaB_pAEIegxo_TRTA () at /code/Sources/NIO/EventLoop.swift:45
#13 0x0000555555599ccf in _T03NIO15ServerBootstrapC13AcceptHandler33_C131C0126670CF68D8B594DDFAE0CE57LLC11channelReadyAA07ChannelE7ContextC3ctx_AA6NIOAnyV4datatFAA15EventLoopFutureCyytGycfU0_TA () at /code/Sources/NIO/Bootstrap.swift:208
#14 0x0000555555599cf1 in _T03NIO15EventLoopFutureCyytGIego_ytADIegio_TRTA () at /code/Sources/NIO/EventLoop.swift:45
#15 0x00005555555d7cf0 in _T03NIO15EventLoopFutureC4thenACyqd__Gs12StaticStringV4file_Su4lineAExctlFAA12CallbackList33_1CAEE0056AB83634CE1A29DE57BC300CLLVycfU_yt_ytTg5 ()
    at /code/Sources/NIO/EventLoopFuture.swift:385
#16 0x0000555555599d19 in _T03NIO15EventLoopFutureC4thenACyqd__Gs12StaticStringV4file_Su4lineAExctlFAA12CallbackList33_1CAEE0056AB83634CE1A29DE57BC300CLLVycfU_yt_ytTg5TA ()
    at /code/Sources/NIO/EventLoop.swift:45
#17 0x00005555555ddc9f in _T03NIO12CallbackList33_1CAEE0056AB83634CE1A29DE57BC300CLLV4_runyyFTf4x_n () at /code/Sources/NIO/EventLoopFuture.swift:82
#18 0x00005555555d6f44 in _T03NIO16EventLoopPromiseV8_resolve33_1CAEE0056AB83634CE1A29DE57BC300CLLyAA0bC11FutureValueAELLOyxG5value_tFyycfU_Sb_Tg5Tm ()
    at /code/Sources/NIO/EventLoopFuture.swift:620
#19 0x00005555555d6de0 in _T03NIO16EventLoopPromiseV8_resolve33_1CAEE0056AB83634CE1A29DE57BC300CLLyAA0bC11FutureValueAELLOyxG5value_tFyycfU_Sb_Tg5 ()
    at /code/Sources/NIO/EventLoopFuture.swift:620
#20 0x00005555555e77e1 in _T03NIO16EventLoopPromiseV8_resolve33_1CAEE0056AB83634CE1A29DE57BC300CLLyAA0bC11FutureValueAELLOyxG5value_tFyycfU_yt_Tg5TA ()
#21 0x00005555555c1d8c in _T0Ieg_ytytIegir_TR ()
#22 0x00005555555cdab1 in _T0Ieg_ytytIegir_TRTA ()
#23 0x00005555555c903e in _T03NIO19withAutoReleasePool33_D5D78C61B22284700B9BD1ACFBC25157LLxxyKcKlFyt_Tg5 () at /code/Sources/NIO/EventLoop.swift:520
#24 _T03NIO19SelectableEventLoopC3runyyKF (self=...) at /code/Sources/NIO/EventLoop.swift:527
#25 0x00005555555d109a in _T03NIO27MultiThreadedEventLoopGroupC014setupThreadAnddE033_D5D78C61B22284700B9BD1ACFBC25157LLAA010SelectabledE0CSS4name_yAA0H0Cc11initializertFZyAJcfU_Tf4gnnnn_n () at /code/Sources/NIO/EventLoop.swift:667
#26 0x00005555555d32a2 in _T03NIO27MultiThreadedEventLoopGroupC014setupThreadAnddE033_D5D78C61B22284700B9BD1ACFBC25157LLAA010SelectabledE0CSS4name_yAA0H0Cc11initializertFZyAJcfU_TA () at /code/Sources/NIO/Utilities.swift:34
#27 0x000055555562b11f in _T03NIO6ThreadCIegx_ACytIegir_TR ()
#28 0x00005555555d32c1 in _T03NIO6ThreadCIegx_ACytIegir_TRTA () at /code/Sources/NIO/Utilities.swift:34
#29 0x000055555562b58c in _T03NIO6ThreadC11spawnAndRunySSSg4name_yACc4bodytFZSvSgAHcfU_ () at /code/Sources/NIO/Thread.swift:105
#30 0x000055555562b5c9 in _T03NIO6ThreadC11spawnAndRunySSSg4name_yACc4bodytFZSvSgAHcfU_To ()
#31 0x00007ffff7bc16ba in start_thread (arg=0x7ffff3d8f700) at pthread_create.c:333
#32 0x00007ffff656141d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

The rest seem to be in SwiftNIO from the lock protecting SelectableEventLoop.scheduledTasks:

Thread 2 "NIO-ELT-#0" hit Catchpoint 4 (returned from syscall futex), __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
135 ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
(gdb) bt
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007ffff7bc3dbd in __GI___pthread_mutex_lock (mutex=0x7fffe4000a10) at ../nptl/pthread_mutex_lock.c:80
#2  0x00005555556341dd in _T021NIOConcurrencyHelpers4LockC4lockyyFTm () at /code/Sources/NIOConcurrencyHelpers/lock.swift:45
#3  0x00005555556341a0 in _T021NIOConcurrencyHelpers4LockC4lockyyF () at /code/Sources/NIOConcurrencyHelpers/lock.swift:45
#4  0x00005555555cbc29 in _T03NIO19SelectableEventLoopC9schedule033_D5D78C61B22284700B9BD1ACFBC25157LLyAA13ScheduledTaskAELLCF (task=..., self=...)
    at /code/Sources/NIO/EventLoop.swift:402
#5  _T03NIO19SelectableEventLoopC7executeyyycFTf4gn_n (task=..., self=...) at /code/Sources/NIO/EventLoop.swift:395
#6  0x00005555555c9d8e in _T03NIO19SelectableEventLoopC7executeyyycF ()
#7  _T03NIO19SelectableEventLoopCAA0cD0A2aDP7executeyyycFTW () at /code/Sources/NIO/EventLoop.swift:394
#8  0x00005555555df762 in _T03NIO15EventLoopFutureC13_whenComplete33_1CAEE0056AB83634CE1A29DE57BC300CLLyAA12CallbackListAELLVycFyt_Tg5Tf4gn_n (callback=...)
    at /code/Sources/NIO/EventLoopFuture.swift:546
#9  _T03NIO15EventLoopFutureC13_whenComplete33_1CAEE0056AB83634CE1A29DE57BC300CLLyAA12CallbackListAELLVycFyt_Tg5 (callback=...) at /code/Sources/NIO/EventLoopFuture.swift:542
#10 _T03NIO15EventLoopFutureC11whenSuccessyyxcFyt_Tg5 () at /code/Sources/NIO/EventLoopFuture.swift:570
#11 _T03NIO15EventLoopFutureC3mapACyqd__Gs12StaticStringV4file_Su4lineqd__xctlFyt_ytTg5Tf4nngn_n (file=..., line=..., callback=..., self=...)
    at /code/Sources/NIO/EventLoopFuture.swift:474
#12 0x0000555555620bd8 in _T03NIO15EventLoopFutureC3mapACyqd__Gs12StaticStringV4file_Su4lineqd__xctlFyt_ytTg5 () at /code/Sources/NIO/SocketChannel.swift:1088
#13 _T03NIO19ServerSocketChannelC12channelRead0yAA6NIOAnyVF (data=...) at /code/Sources/NIO/SocketChannel.swift:1088
#14 0x000055555561d51e in _T03NIO17BaseSocketChannelCyxGAA0D4CoreA2aEP12channelRead0yAA6NIOAnyVFTW ()
#15 0x00005555555b06e0 in _T03NIO18TailChannelHandler33_EEC863903996E9F191EBAFEB0FB0DFDDLLC11channelReadyAA0cD7ContextC3ctx_AA6NIOAnyV4datatF (ctx=..., data=...)
    at /code/Sources/NIO/ChannelPipeline.swift:811
#16 _T03NIO18TailChannelHandler33_EEC863903996E9F191EBAFEB0FB0DFDDLLCAA01_c7InboundD0A2aEP11channelReadyAA0cD7ContextC3ctx_AA6NIOAnyV4datatFTW ()
    at /code/Sources/NIO/ChannelPipeline.swift:810
#17 0x00005555555af80c in _T03NIO21ChannelHandlerContextC06invokeB4Read33_EEC863903996E9F191EBAFEB0FB0DFDDLLyAA6NIOAnyVF (data=..., self=...)
    at /code/Sources/NIO/ChannelPipeline.swift:1096
#18 0x0000555555599dbd in _T03NIO15ServerBootstrapC13AcceptHandler33_C131C0126670CF68D8B594DDFAE0CE57LLC11channelReadyAA07ChannelE7ContextC3ctx_AA6NIOAnyV4datatFyycfU1_TA ()
    at /code/Sources/NIO/ChannelPipeline.swift:909
#19 0x00005555555c1d8c in _T0Ieg_ytytIegir_TR ()
#20 0x0000555555599df1 in _T0Ieg_ytytIegir_TRTA () at /code/Sources/NIO/EventLoop.swift:45
#21 0x00005555555e7881 in _T0xytIegir_xIegi_r__lTRyt_3NIO7Channel_pTG5TA ()
#22 0x00005555555e8869 in _T0xytIegir_xIegi_r__lTRyt_ytTG5TA ()
#23 0x00005555555e6e52 in _T03NIO15EventLoopFutureC11whenSuccessyyxcFAA12CallbackList33_1CAEE0056AB83634CE1A29DE57BC300CLLVycfU_yt_Tg5TA ()
    at /code/Sources/NIO/EventLoopFuture.swift:572
#24 0x00005555555a822f in _T03NIO12CallbackList33_1CAEE0056AB83634CE1A29DE57BC300CLLVIego_ytADIegir_TR ()
#25 0x00005555555e1441 in _T03NIO12CallbackList33_1CAEE0056AB83634CE1A29DE57BC300CLLVIego_ytADIegir_TRTA ()
#26 0x00005555555e7fb9 in _T03NIO12CallbackList33_1CAEE0056AB83634CE1A29DE57BC300CLLVIego_ytADIegir_TRTA.692 ()
#27 0x00005555555dda80 in _T03NIO12CallbackList33_1CAEE0056AB83634CE1A29DE57BC300CLLV4_runyyFTf4x_n () at /code/Sources/NIO/EventLoopFuture.swift:106
#28 0x00005555555d6f44 in _T03NIO16EventLoopPromiseV8_resolve33_1CAEE0056AB83634CE1A29DE57BC300CLLyAA0bC11FutureValueAELLOyxG5value_tFyycfU_Sb_Tg5Tm ()
    at /code/Sources/NIO/EventLoopFuture.swift:620
#29 0x00005555555d6de0 in _T03NIO16EventLoopPromiseV8_resolve33_1CAEE0056AB83634CE1A29DE57BC300CLLyAA0bC11FutureValueAELLOyxG5value_tFyycfU_Sb_Tg5 ()
    at /code/Sources/NIO/EventLoopFuture.swift:620
#30 0x00005555555e77e1 in _T03NIO16EventLoopPromiseV8_resolve33_1CAEE0056AB83634CE1A29DE57BC300CLLyAA0bC11FutureValueAELLOyxG5value_tFyycfU_yt_Tg5TA ()
#31 0x00005555555c1d8c in _T0Ieg_ytytIegir_TR ()
#32 0x00005555555cdab1 in _T0Ieg_ytytIegir_TRTA ()
#33 0x00005555555c903e in _T03NIO19withAutoReleasePool33_D5D78C61B22284700B9BD1ACFBC25157LLxxyKcKlFyt_Tg5 () at /code/Sources/NIO/EventLoop.swift:520
#34 _T03NIO19SelectableEventLoopC3runyyKF (self=...) at /code/Sources/NIO/EventLoop.swift:527
---Type <return> to continue, or q <return> to quit---
#35 0x00005555555d109a in _T03NIO27MultiThreadedEventLoopGroupC014setupThreadAnddE033_D5D78C61B22284700B9BD1ACFBC25157LLAA010SelectabledE0CSS4name_yAA0H0Cc11initializertFZyAJcfU_Tf4gnnnn_n () at /code/Sources/NIO/EventLoop.swift:667
#36 0x00005555555d32a2 in _T03NIO27MultiThreadedEventLoopGroupC014setupThreadAnddE033_D5D78C61B22284700B9BD1ACFBC25157LLAA010SelectabledE0CSS4name_yAA0H0Cc11initializertFZyAJcfU_TA () at /code/Sources/NIO/Utilities.swift:34
#37 0x000055555562b11f in _T03NIO6ThreadCIegx_ACytIegir_TR ()
#38 0x00005555555d32c1 in _T03NIO6ThreadCIegx_ACytIegir_TRTA () at /code/Sources/NIO/Utilities.swift:34
#39 0x000055555562b58c in _T03NIO6ThreadC11spawnAndRunySSSg4name_yACc4bodytFZSvSgAHcfU_ () at /code/Sources/NIO/Thread.swift:105
#40 0x000055555562b5c9 in _T03NIO6ThreadC11spawnAndRunySSSg4name_yACc4bodytFZSvSgAHcfU_To ()
#41 0x00007ffff7bc16ba in start_thread (arg=0x7ffff3d8f700) at pthread_create.c:333
#42 0x00007ffff656141d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb) bt
#0  __lll_unlock_wake () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:371
#1  0x00007ffff7bc54ff in __pthread_mutex_unlock_usercnt (decr=1, mutex=0x7fffe4000a10) at pthread_mutex_unlock.c:55
#2  __GI___pthread_mutex_unlock (mutex=0x7fffe4000a10) at pthread_mutex_unlock.c:314
#3  0x00005555556341dd in _T021NIOConcurrencyHelpers4LockC4lockyyFTm () at /code/Sources/NIOConcurrencyHelpers/lock.swift:45
#4  0x00005555556341c0 in _T021NIOConcurrencyHelpers4LockC6unlockyyF () at /code/Sources/NIOConcurrencyHelpers/lock.swift:45
#5  0x00005555555cbc8d in _T03NIO19SelectableEventLoopC9schedule033_D5D78C61B22284700B9BD1ACFBC25157LLyAA13ScheduledTaskAELLCF (task=..., self=...)
    at /code/Sources/NIO/EventLoop.swift:404
Lukasa commented 6 years ago

This is definitely worth diving into to see if we can elide the calls to futex, but I don't think this is the highest priority right now: in general futexes are fast, particularly where there is relatively little contention on them as in the case of a channel handler init.

I'm going to reach out to some folks on the Swift team and see if they have any particular thoughts on whether we can avoid this, but I suspect it's basically unavoidable for generic classes like HTTPDecoder.

Lukasa commented 6 years ago

(Naturally it goes without saying that the futex around scheduledTasks is unavoidable in the current implementation. At some point we'll want to investigate the performance profile of a lock-free heap, but that's not necessarily a guaranteed win either.)

jckarter commented 6 years ago

Does the futex happen on every access, or only the first? We could consider using dispatch_once for one-time initializations like we do on Darwin, which should not have any barrier after the initialization happens. If you're writing static lets that are constants, and the optimizer is not turning them into static initializations, then please file optimizer bugs too.

gparker42 commented 6 years ago

Yes, it would help if you could distinguish between futex accesses during launch time or at first use, versus futex accesses during steady-state operation. Both are worth investigating to improve performance, but the latter would be a more significant problem.

toffaletti commented 6 years ago

Thanks, I will work on teasing those apart. The issue I'm running into now is that I have to use an optimized build which makes debugging a bit more difficult and I haven't figured out a way to make gdb only pause execution on the futex calls in the thread I care about and pausing on all threads calls to futex seems to introduce different behavior because of timeouts and various timing issues. I might need to resort to a different tool.

weissi commented 5 years ago

We should look into this

allright commented 5 years ago

I think this is the over side of this issue: https://github.com/apple/swift-nio/issues/894

weissi commented 2 years ago

@Lukasa et al: I think we can close this issue. I recently made some perf analysis at work and I counted every single syscall in a program that doesn't use wait().

Program

let group = MultiThreadedEventLoopGroup(numberOfThreads: Int(CommandLine.arguments.dropFirst().first!)!)
defer {
    try! group.syncShutdownGracefully()
}

let allDone = group.next().makePromise(of: Int.self)

func doIt(allDone: EventLoopPromise<Int>, n: Int, accumulator: Int) {
    guard n > 0 else {
        allDone.succeed(accumulator)
        return
    }

    group.next().makeSucceededFuture(1).flatMap { one in
        group.next().makeSucceededFuture(one + 1)
    }.flatMap { two in
        group.next().makeSucceededFuture(two + 1)
    }.flatMap { three in
        group.next().makeSucceededFuture(three + 1)
    }.flatMap { four in
        group.next().makeSucceededFuture(four + 1)
    }.whenSuccess { sum in
        group.next().execute {
            doIt(allDone: allDone, n: n - 1, accumulator: accumulator + sum)
        }
    }
}

doIt(allDone: allDone, n: 10_000, accumulator: 0)
print(try allDone.futureResult.wait())

and the results were:

# perf stat -e 'syscalls:sys_enter_*' .build/release/TestProgram 1 | grep -E '[0-9]{3,}'
50000
             20023      syscalls:sys_enter_clock_gettime
       0.189682700 seconds time elapsed
       0.068381000 seconds user
       0.111246000 seconds sys

the clock_gettime only appears here because this was in Docker for Mac where the VSDO'd clock_gettime is disabled because of hypervisor deficiencies...

# perf stat -e 'syscalls:sys_enter_*' .build/release/TestProgram 4 | grep -E '[0-9]{3,}'
50000
             79613      syscalls:sys_enter_epoll_wait
             79631      syscalls:sys_enter_read
             79613      syscalls:sys_enter_write
            200046      syscalls:sys_enter_clock_gettime
       7.557578300 seconds time elapsed
       0.821708000 seconds user
       5.878427000 seconds sys

the read/write here are for the eventfd

Lukasa commented 2 years ago

Happy to close this.