Closed ric-rodriguez closed 2 years ago
Thanks for the report! So that I can reproduce it fully, are you writing any data to spipe? (for example, did you press
If you were writing data to spipe, then I believe that #274 (and in particular, d0d47f3b846cb85c0b1587ed1ddd9634825675ef) fixes it. That was added to git master after 1.6.1. (Although I note that you said you tested on master, so I'm puzzled.)
With your branch, if I don't write to spipe, then the spipe process still doesn't finish. (At least, on FreeBSD it's still going. Maybe there is a difference in how OpenBSD handles read(2)?)
I'm continuing to investigate.
Correct. If you provide any input to STDIN then it works. The problem is the thread is waiting for data from STDIN indefinitely before attempting to write it to the closed socket, causing a failure and exiting.
Unfortunately, I'm not seeing any change of behaviour due to this patch on FreeBSD. What operating system are you using? Could you please confirm that on your system, with this patch, you see spipe
stopping by itself, without any stdin input? If you do, then I'll install openbsd in a VM to investigate more.
As far as I understand, when spiped
dies, the main thread in spipe
will get to callback_pipestatus
in lib/proto/proto_conn.c
, with C->stat_f
being 1 and C->stat_r
being 0. That's not enough to trigger proto_conn_drop()
, so in turn the event loop in spiped/main.c
will not exit. If my understanding is correct, then we won't reach the proposed additions of pthread_cancel
.
Theoretically, we could change the "kill the connection" case to "if either direction has been shut down", but the
/* If both directions have been shut down, kill the connection. */
comment in callback_pipestatus
indicates that this is a deliberate choice. That code was present in the very first commit 64f9ad331df4b4febb44fddab3155bd55c90f034 (the file was then called conn.c
in the top level) on 2011-07-02.
We are using Amazon Linux 2
[root@my BUILD]# uname -a
Linux my.vaultdev.com 4.14.231-173.361.amzn2.x86_64 #1 SMP Mon Apr 26 20:57:08 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
I see now the same issue on my FreeBSD test system on master branch - while I observed the issue on both master/1.6.1 I only tested the fix against 1.6.1 (sorry about that).
You are correct, the event look is continuing based on the state
(gdb) break callback_pipestatus
Breakpoint 1 at 0x406ff0: callback_pipestatus. (2 locations)
(gdb) continue
Continuing.
Thread 1 hit Breakpoint 1, callback_pipestatus (cookie=0x801026000) at ../lib/proto/proto_conn.c:393
393 if ((C->stat_f == -1) || (C->stat_r == -1))
(gdb) set print pretty on
(gdb) p *((struct conn_state*) C)
$2 = {
callback_dead = 0x402501 <callback_conndied>,
cookie = 0x7fffffffe8f0,
sas = 0x0,
decr = 0,
nopfs = 0,
requirepfs = 0,
nokeepalive = 0,
K = 0x80101c020,
timeo = 5,
s = 4,
t = 5,
connect_cookie = 0x0,
connect_timeout_cookie = 0x0,
handshake_cookie = 0x0,
handshake_timeout_cookie = 0x0,
k_f = 0x801052000,
k_r = 0x8010520e0,
pipe_f = 0x801059000,
pipe_r = 0x801059a00,
stat_f = 1,
stat_r = 0
}
I experimented with adding events_interrupt
to workthread_cleanup
and this appeared to work
I'll do some more investigation
FWIW, the reason behind
/* If both directions have been shut down, kill the connection. */
is that we want the same behaviour as regular (not proxied via spiped) TCP connections -- each direction is shut down independently, so that you can e.g. printf(1) an HTTP request and pipe that into spipe, even though the HTTP response will arrive back long after printf exits (and thus spipe's stdin is EOFed).
Whether we're doing this correctly is an open question, of course. But that's the idea -- an error can shut down the connection in both directions, but EOF only shuts down the connection in one direction.
Right, I figured there was some reason like that. I thought that I had it saved in an email somewhere, but if so, I haven't found the email yet.
@ric-rodriguez, what sparked your interest in this? Were you auditing the code / testing the program, and discovered this behaviour? Or did you run into this via your normal usage?
If there's a use-case for for spipe
to quit as soon as the connection breaks in one direction (or a specific direction) -- unlike a TCP connection as @cperciva described -- then I could imagine this being a runtime option. Maybe something like --drop-upon-shutdown
or --drop-upon-server-shutdown
or --fragile
?
It's worth noting that in the test case described, the spiped
server shut down the connection early, not the client. So a hypothetical --drop-upon-server-shutdown
option would not affect a printf "foo\n" | spipe
construct.
Hmm... looking back at the test case, I think it will depend on how the OS handles TCP sockets owned by a process which is being terminated. If it sends a RST then spipe should get a read error and exit; if it sends a FIN then spipe will continue until it tries to send something (at which point it will receive a RST and exit).
IIRC this depends on the SO_LINGER setting and possibly on whether there's any data buffered when the socket is closed.
Since this patch wasn't successful, I've opened an issue #329 for discussing it, and I'm closing the PR.
Hello, we are seeing the following issue during spipe client cleanup.
When the spiped server connection is terminated, the client can be blocked on a read() operation. Steps to reproduce:
(gdb) continue Continuing. [Thread 0x7fbea5324700 (LWP 2441) exited] ^C Thread 1 "spipe-1.6.1" received signal SIGINT, Interrupt. 0x00007fbea5cf970a in pthread_join () from /lib64/libpthread.so.0 (gdb) info threads Id Target Id Frame
0 0x00007fbea5cf970a in pthread_join () from /lib64/libpthread.so.0
1 0x00000000004025d7 in ?? ()
2 0x00007fbea575f0ba in __libc_start_main () from /lib64/libc.so.6
3 0x000000000040265a in ?? ()
(gdb) thread 3 [Switching to thread 3 (Thread 0x7fbea4b23700 (LWP 2442))]
0 0x00007fbea5d014f3 in read () from /lib64/libpthread.so.0
(gdb) bt
0 0x00007fbea5d014f3 in read () from /lib64/libpthread.so.0
1 0x00000000004028d9 in ?? ()
2 0x00007fbea5cf840b in start_thread () from /lib64/libpthread.so.0
3 0x00007fbea582b09f in clone () from /lib64/libc.so.6
Propose to cancel threads before joining, according to the POSIX spec it should be safe to cancel on read().
Please let me know if you have any feedback on the MR.
Version: 1.6.1 (Also tested on master)