Tarsnap / spiped

Spiped is a utility for creating symmetrically encrypted and authenticated pipes between socket addresses.
http://www.tarsnap.com/spiped.html
Other
855 stars 56 forks source link

spipe/main.c: Fix spiped thread stuck on read() operation when remote… #328

Closed ric-rodriguez closed 2 years ago

ric-rodriguez commented 2 years ago

Hello, we are seeing the following issue during spipe client cleanup.

When the spiped server connection is terminated, the client can be blocked on a read() operation. Steps to reproduce:

  1. Start spiped daemon
    [root@my BUILD]# ./spiped -d -s [0.0.0.0]:1234 -t [127.0.0.1]:22 -k ./keyfile -p spiped.pid -F
  2. Start spipe client
    [root@my BUILD]# ./spipe -t 127.0.0.1:1234 -k ./keyfile
    SSH-2.0-OpenSSH_7.4
  3. Teminate spiped daemon
    [root@my BUILD]# ./spiped -d -s [0.0.0.0]:1234 -t [127.0.0.1]:22 -k ./keyfile -p spiped.pid -F
    ^C
    [root@my BUILD]#
  4. Spipe client does not exit
    
    [root@my BUILD]# ./spipe -t 127.0.0.1:1234 -k ./keyfile
    SSH-2.0-OpenSSH_7.4

The expected behavior is that the spipe client exits when the connection is lost

Attaching gdb to the spipe process reveals that it is blocked on a read operation

(gdb) continue Continuing. [Thread 0x7fbea5324700 (LWP 2441) exited] ^C Thread 1 "spipe-1.6.1" received signal SIGINT, Interrupt. 0x00007fbea5cf970a in pthread_join () from /lib64/libpthread.so.0 (gdb) info threads Id Target Id Frame

Propose to cancel threads before joining, according to the POSIX spec it should be safe to cancel on read().

Please let me know if you have any feedback on the MR.

Version: 1.6.1 (Also tested on master)

gperciva commented 2 years ago

Thanks for the report! So that I can reproduce it fully, are you writing any data to spipe? (for example, did you press ?)

If you were writing data to spipe, then I believe that #274 (and in particular, d0d47f3b846cb85c0b1587ed1ddd9634825675ef) fixes it. That was added to git master after 1.6.1. (Although I note that you said you tested on master, so I'm puzzled.)

With your branch, if I don't write to spipe, then the spipe process still doesn't finish. (At least, on FreeBSD it's still going. Maybe there is a difference in how OpenBSD handles read(2)?)

I'm continuing to investigate.

ric-rodriguez commented 2 years ago

Correct. If you provide any input to STDIN then it works. The problem is the thread is waiting for data from STDIN indefinitely before attempting to write it to the closed socket, causing a failure and exiting.

gperciva commented 2 years ago

Unfortunately, I'm not seeing any change of behaviour due to this patch on FreeBSD. What operating system are you using? Could you please confirm that on your system, with this patch, you see spipe stopping by itself, without any stdin input? If you do, then I'll install openbsd in a VM to investigate more.

As far as I understand, when spiped dies, the main thread in spipe will get to callback_pipestatus in lib/proto/proto_conn.c, with C->stat_f being 1 and C->stat_r being 0. That's not enough to trigger proto_conn_drop(), so in turn the event loop in spiped/main.c will not exit. If my understanding is correct, then we won't reach the proposed additions of pthread_cancel.

Theoretically, we could change the "kill the connection" case to "if either direction has been shut down", but the

/* If both directions have been shut down, kill the connection. */

comment in callback_pipestatus indicates that this is a deliberate choice. That code was present in the very first commit 64f9ad331df4b4febb44fddab3155bd55c90f034 (the file was then called conn.c in the top level) on 2011-07-02.

ric-rodriguez commented 2 years ago

We are using Amazon Linux 2

[root@my BUILD]# uname -a
Linux my.vaultdev.com 4.14.231-173.361.amzn2.x86_64 #1 SMP Mon Apr 26 20:57:08 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

I see now the same issue on my FreeBSD test system on master branch - while I observed the issue on both master/1.6.1 I only tested the fix against 1.6.1 (sorry about that).

You are correct, the event look is continuing based on the state

(gdb) break callback_pipestatus
Breakpoint 1 at 0x406ff0: callback_pipestatus. (2 locations)
(gdb) continue
Continuing.

Thread 1 hit Breakpoint 1, callback_pipestatus (cookie=0x801026000) at ../lib/proto/proto_conn.c:393
393     if ((C->stat_f == -1) || (C->stat_r == -1))
(gdb) set print pretty on
(gdb) p *((struct conn_state*) C)
$2 = {
  callback_dead = 0x402501 <callback_conndied>,
  cookie = 0x7fffffffe8f0,
  sas = 0x0,
  decr = 0,
  nopfs = 0,
  requirepfs = 0,
  nokeepalive = 0,
  K = 0x80101c020,
  timeo = 5,
  s = 4,
  t = 5,
  connect_cookie = 0x0,
  connect_timeout_cookie = 0x0,
  handshake_cookie = 0x0,
  handshake_timeout_cookie = 0x0,
  k_f = 0x801052000,
  k_r = 0x8010520e0,
  pipe_f = 0x801059000,
  pipe_r = 0x801059a00,
  stat_f = 1,
  stat_r = 0
}

I experimented with adding events_interrupt to workthread_cleanup and this appeared to work

I'll do some more investigation

cperciva commented 2 years ago

FWIW, the reason behind

/* If both directions have been shut down, kill the connection. */

is that we want the same behaviour as regular (not proxied via spiped) TCP connections -- each direction is shut down independently, so that you can e.g. printf(1) an HTTP request and pipe that into spipe, even though the HTTP response will arrive back long after printf exits (and thus spipe's stdin is EOFed).

Whether we're doing this correctly is an open question, of course. But that's the idea -- an error can shut down the connection in both directions, but EOF only shuts down the connection in one direction.

gperciva commented 2 years ago

Right, I figured there was some reason like that. I thought that I had it saved in an email somewhere, but if so, I haven't found the email yet.

@ric-rodriguez, what sparked your interest in this? Were you auditing the code / testing the program, and discovered this behaviour? Or did you run into this via your normal usage?

If there's a use-case for for spipe to quit as soon as the connection breaks in one direction (or a specific direction) -- unlike a TCP connection as @cperciva described -- then I could imagine this being a runtime option. Maybe something like --drop-upon-shutdown or --drop-upon-server-shutdown or --fragile?

It's worth noting that in the test case described, the spiped server shut down the connection early, not the client. So a hypothetical --drop-upon-server-shutdown option would not affect a printf "foo\n" | spipe construct.

cperciva commented 2 years ago

Hmm... looking back at the test case, I think it will depend on how the OS handles TCP sockets owned by a process which is being terminated. If it sends a RST then spipe should get a read error and exit; if it sends a FIN then spipe will continue until it tries to send something (at which point it will receive a RST and exit).

IIRC this depends on the SO_LINGER setting and possibly on whether there's any data buffered when the socket is closed.

gperciva commented 2 years ago

Since this patch wasn't successful, I've opened an issue #329 for discussing it, and I'm closing the PR.