SIPp / sipp

The SIPp testing tool
https://sipp.readthedocs.io
Other
918 stars 380 forks source link

Ensure that sockets are marked as non-blocking before OpenSSL calls are made #620

Closed rkday closed 1 year ago

rkday commented 1 year ago

A colleague reported an issue where:

The problem (which I diagnosed with gdb) was that SIPp was stuck in a SSL_read call:

(gdb) bt
#0  0x00007fbe2e21a392 in __libc_read (fd=5, buf=0x55ceafd526d3, nbytes=5) at ../sysdeps/unix/sysv/linux/read.c:26
#1  0x00007fbe2df4c3d9 in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.1.1
#2  0x00007fbe2df4767e in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.1.1
#3  0x00007fbe2df464d4 in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.1.1
#4  0x00007fbe2df46aa7 in BIO_read () from /lib/x86_64-linux-gnu/libcrypto.so.1.1
#5  0x00007fbe2e198b91 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.1
#6  0x00007fbe2e19ce1e in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.1
#7  0x00007fbe2e19a6d0 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.1
#8  0x00007fbe2e1a1c45 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.1
#9  0x00007fbe2e1aca3f in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.1
#10 0x00007fbe2e1acb47 in SSL_read () from /lib/x86_64-linux-gnu/libssl.so.1.1
#11 0x000055ceada5d28a in SIPpSocket::empty (this=0x55ceafd4af50) at /home/developer/src/sipp/src/socket.cpp:874
#12 0x000055ceada62f33 in SIPpSocket::pollset_process (wait=1) at /home/developer/src/sipp/src/socket.cpp:2939
#13 0x000055ceada79636 in traffic_thread (rtp_errors=@0x7ffee793fea8: 0, echo_errors=@0x7ffee793feac: 0) at /home/developer/src/sipp/src/sipp.cpp:599
#14 0x000055ceada7e08a in main (argc=14, argv=0x7ffee7940388) at /home/developer/src/sipp/src/sipp.cpp:2147

Presumably this is because it’s never sent anything, so there’s nothing coming in to read, and it can’t send anything because it’s blocking on this read.

Resizing the window sends a SIGWINCH – this causes the SSL_read to be interrupted (and return EINTR), so SIPp can move on to the tasks that send messages, and then there’s a response so the subsequent SSL_read reads it, and everyone’s happy. Any other signal, e.g. SIGUSR2, would have the same effect.

This fix seems to work, marking the socket as nonblocking before we ever try and read from it.

I think this is needed for TLS and not TCP because, per https://stackoverflow.com/a/41948239, , “the SSL setup calls will examine the state of your fd and set private variables within the BIO and the SSL objects based on the blocking/non-blocking state of the fd”. A later bit of code sets O_NONBLOCK, and that’s good enough for TCP, but I think by that time OpenSSL has already put the socket in blocking mode, because it was in blocking mode at the time of the setup calls.

rkday commented 1 year ago

@wdoekes , I'm thinking of tagging 3.7.0 once this, #618 and #619 are in - what do you think?

wdoekes commented 1 year ago

I think there are some others that would be nice to have:

Maybe 613 and 617 can go in as-is, but make a new ticket to fix them up properly.

wdoekes commented 1 year ago

And #581 and maybe #565.

rkday commented 1 year ago

OK:

I'll probably merge all of #617 through #622 in the next day or two then tag 3.7.0 - shout if you'd like me to hold off, or if you get a repro for #565 and want that fix in too.

wdoekes commented 1 year ago

Thanks!

I don't have time to work on these now. I think we can do without 565.

Go ahead and merge and tag :)