intel / asynch_mode_nginx

Endless epoll operations lead to 100% CPU usage #82

Open mcdullbloom opened 1 month ago

mcdullbloom commented 1 month ago

Same as #80: nginx enters a loop of epoll event delete, add, and modify operations, which drives CPU usage to 100%.

read(3374, 0x5644fd769103, 16709)       = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(784, EPOLL_CTL_DEL, 3374, 0x7fffa1fb1814) = 0
epoll_ctl(784, EPOLL_CTL_ADD, 3374, {EPOLLIN|EPOLLRDHUP|EPOLLET, {u32=4162242016, u64=140123175306720}}) = 0
epoll_ctl(784, EPOLL_CTL_MOD, 3374, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=4162242016, u64=140123175306720}}) = 0
epoll_wait(784, [{EPOLLOUT, {u32=4162242016, u64=140123175306720}}], 512, 440) = 1
read(3374, 0x5644fd769103, 16709)       = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(784, EPOLL_CTL_DEL, 3374, 0x7fffa1fb1814) = 0
epoll_ctl(784, EPOLL_CTL_ADD, 3374, {EPOLLIN|EPOLLRDHUP|EPOLLET, {u32=4162242016, u64=140123175306720}}) = 0
epoll_ctl(784, EPOLL_CTL_MOD, 3374, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=4162242016, u64=140123175306720}}) = 0
epoll_wait(784, [{EPOLLOUT, {u32=4162242016, u64=140123175306720}}], 512, 440) = 1
read(3374, 0x5644fd769103, 16709)       = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(784, EPOLL_CTL_DEL, 3374, 0x7fffa1fb1814) = 0
epoll_ctl(784, EPOLL_CTL_ADD, 3374, {EPOLLIN|EPOLLRDHUP|EPOLLET, {u32=4162242016, u64=140123175306720}}) = 0
epoll_ctl(784, EPOLL_CTL_MOD, 3374, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=4162242016, u64=140123175306720}}) = 0
epoll_wait(784, [{EPOLLOUT, {u32=4162242016, u64=140123175306720}}], 512, 440) = 1
read(3374, 0x5644fd769103, 16709)       = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(784, EPOLL_CTL_DEL, 3374, 0x7fffa1fb1814) = 0
epoll_ctl(784, EPOLL_CTL_ADD, 3374, {EPOLLIN|EPOLLRDHUP|EPOLLET, {u32=4162242016, u64=140123175306720}}) = 0
epoll_ctl(784, EPOLL_CTL_MOD, 3374, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=4162242016, u64=140123175306720}}) = 0
epoll_wait(784, [{EPOLLOUT, {u32=4162242016, u64=140123175306720}}], 512, 439) = 1
read(3374, 0x5644fd769103, 16709)       = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(784, EPOLL_CTL_DEL, 3374, 0x7fffa1fb1814) = 0
epoll_ctl(784, EPOLL_CTL_ADD, 3374, {EPOLLIN|EPOLLRDHUP|EPOLLET, {u32=4162242016, u64=140123175306720}}) = 0
epoll_ctl(784, EPOLL_CTL_MOD, 3374, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=4162242016, u64=140123175306720}}) = 0
epoll_wait(784, [{EPOLLOUT, {u32=4162242016, u64=140123175306720}}], 512, 439) = 1
read(3374, 0x5644fd769103, 16709)       = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(784, EPOLL_CTL_DEL, 3374, 0x7fffa1fb1814) = 0
epoll_ctl(784, EPOLL_CTL_ADD, 3374, {EPOLLIN|EPOLLRDHUP|EPOLLET, {u32=4162242016, u64=140123175306720}}) = 0
epoll_ctl(784, EPOLL_CTL_MOD, 3374, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=4162242016, u64=140123175306720}}) = 0
epoll_wait(784, [{EPOLLOUT, {u32=4162242016, u64=140123175306720}}], 512, 439) = 1
read(3374, 0x5644fd769103, 16709)       = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(784, EPOLL_CTL_DEL, 3374, 0x7fffa1fb1814) = 0
……
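
The pattern in the trace can be replayed outside nginx. The following is only a minimal Python sketch, assuming Linux and using a local socket pair in place of the TLS connection; the function name and parameters are illustrative and nothing here is taken from the nginx source. It issues the same DEL, ADD, MOD sequence on an idle socket. Since the socket is always writable, re-arming EPOLLOUT in edge-triggered mode makes epoll_wait return at once instead of sleeping for its timeout, which matches the EPOLLOUT wake-ups shown above.

```python
import select
import socket


def count_wakeups(rounds=5, timeout=0.05):
    """Replay read/DEL/ADD/MOD/epoll_wait on an idle socket; count EPOLLOUT wake-ups."""
    # 'peer' never sends anything, like the blocked client in this issue.
    sock, peer = socket.socketpair()
    sock.setblocking(False)
    read_mask = select.EPOLLIN | select.EPOLLRDHUP | select.EPOLLET
    ep = select.epoll()
    ep.register(sock.fileno(), read_mask)
    wakeups = 0
    try:
        for _ in range(rounds):
            try:
                sock.recv(16709)              # read(...) = -1 EAGAIN
            except BlockingIOError:
                pass
            ep.unregister(sock.fileno())      # EPOLL_CTL_DEL
            ep.register(sock.fileno(), read_mask)                    # EPOLL_CTL_ADD
            ep.modify(sock.fileno(), read_mask | select.EPOLLOUT)    # EPOLL_CTL_MOD
            events = ep.poll(timeout)         # epoll_wait
            if any(ev & select.EPOLLOUT for _, ev in events):
                wakeups += 1
    finally:
        ep.close()
        sock.close()
        peer.close()
    return wakeups
```

Every round should produce a wake-up even though no data ever arrives, so a handler that reacts to each wake-up by repeating the same sequence never blocks.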

version

asynch_mode_nginx v0.5.0, OpenSSL 1.1.1s, QAT engine v0.6.18

config

ssl_asynch  on;
……
ssl_engine {
    use_engine qatengine;
    default_algorithms ALL;
    qat_engine {
        qat_offload_mode async;
        qat_notify_mode poll;
        qat_external_poll_interval 1;
        qat_poll_mode internal;
    }
}

steps to reproduce

  1. The client opens a short HTTPS connection (with the header "Connection: close") and then blocks indefinitely. See the code from #80, at the end of that issue.
  2. nginx calls SSL_shutdown to close the SSL connection, and it returns -1.
  3. SSL_get_error returns SSL_ERROR_WANT_READ.

The three methods below make the nginx worker enter a loop. I'm not familiar with the SSL asynch and QAT engine code, so could someone make a PR to fix the bug?

[image]
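
For readers without #80 at hand, step 1 could look roughly like the sketch below. This is not the code from #80; it is a hedged approximation in Python, and the function names, the default port, and the hold time are all assumptions. It sends one request with "Connection: close" and then neither reads the response nor closes the socket, so the server's SSL_shutdown keeps waiting for the peer.

```python
import socket
import ssl
import time


def build_request(host):
    """Build a minimal HTTP/1.1 request that asks the server to close the connection."""
    return (
        "GET / HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def run_client(host, port=443, hold_seconds=3600):
    """Send one request over TLS, then stall without reading or closing."""
    ctx = ssl.create_default_context()
    # Assumption: a test server with a self-signed certificate, so verification is off.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port)) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_request(host))
            # Block here: no recv(), no close_notify, no TCP close.
            time.sleep(hold_seconds)
```

Calling run_client with the address of the test nginx instance and attaching strace to the worker should then show whether the loop from the trace appears.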

hardikpatel9 commented 1 month ago

Hi @mcdullbloom ,

Can you please try with the latest versions listed below?

  - QAT Engine: v1.6.1 or latest
  - Crypto libraries: OpenSSL 3.0.14
  - NGINX: release tag v0.5.1

mcdullbloom commented 1 month ago

@hardikpatel9 This appears to be a bug in asynch nginx itself; it does not depend on qatengine or OpenSSL. You can reproduce it in qat-sw mode by forcing SSL_get_error to return SSL_ERROR_WANT_READ.
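
SSL_ERROR_WANT_READ means the operation can only make progress once more data arrives from the peer. As a rough illustration of why the registered event mask matters in that state, the Python sketch below (Linux only, a local socket pair instead of a TLS connection, all names hypothetical) counts how often epoll returns immediately on an idle socket for two masks. It models only the epoll side and is neither nginx's code nor a proposed patch.

```python
import select
import socket

READ_ONLY = select.EPOLLIN | select.EPOLLRDHUP | select.EPOLLET
READ_WRITE = READ_ONLY | select.EPOLLOUT


def immediate_wakeups(mask, rounds=3, timeout=0.02):
    """Re-arm 'mask' on an idle socket each round; count polls that return events."""
    sock, peer = socket.socketpair()   # 'peer' stays silent, like the stalled client
    sock.setblocking(False)
    ep = select.epoll()
    ep.register(sock.fileno(), mask)
    hits = 0
    try:
        for _ in range(rounds):
            ep.modify(sock.fileno(), mask)   # re-arm, as EPOLL_CTL_MOD does in the trace
            if ep.poll(timeout):             # empty list means the wait timed out
                hits += 1
    finally:
        ep.close()
        sock.close()
        peer.close()
    return hits
```

With READ_ONLY the poll is expected to time out every round, while READ_WRITE wakes up every round because the socket is writable; this is consistent with the EPOLLOUT events in the trace.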

hardikpatel9 commented 1 month ago

Hi @mcdullbloom ,

I've posted the latest findings on https://github.com/intel/asynch_mode_nginx/issues/80