openssh-rust / openssh

Scriptable SSH through OpenSSH in Rust
Apache License 2.0
232 stars 37 forks source link

`wait_with_output` + `native-mux` cuts off stdout output #103

Closed Fuuzetsu closed 1 year ago

Fuuzetsu commented 2 years ago

I have code that's roughly like follows:

// …
let std::process::Output {
        status,
        stderr,
        stdout,
    } = c.spawn().await?.wait_with_output().await?;
// …
let content = &stdout;
tracing::warn!(
            "daily content is {} long and ends with {:?}",
            content.len(),
            content.iter().rev().next()
        );
// …

The command being ran on the remote produces relatively large amount of output on a single line. When I'm using native-mux feature, this output is incomplete!

With native-mux I see:

2022-10-25T05:43:55.091989Z  WARN start_sim::live: daily content is 147456 long and ends with Some(110)

but if I change to process-mux and change connect_mux to connect, I see

2022-10-25T05:42:40.638584Z  WARN start_sim::live: daily content is 410777 long and ends with Some(10)

The process-mux is correct.

I don't have minimum reproducible example at hand but I suspect one can just create a large file (say 1MiB) on a remote and spawn cat large-file via openssh and reproduce the issue.

I tried with 0.9.7, same result.

NobodyXu commented 2 years ago

Hmmm, I'm not sure where goes wrong.

FYI, native-mux uses openssh_mux_client::Connection::open_new_session to create process on remote machine.

It creates a pipe and passes the write end of it to the ssh multiplex server over unix socket, and then the multiplex server is responsible for communicating with sshd and read all data into the fd passed to it.

It could be that:

NobodyXu commented 2 years ago

The implementation of RemoteChild::wait_for_output is the same for both process-mux and native-mux, which uses tokio::try_join! to wait and read from stdout/stderr at the same time.

Fuuzetsu commented 2 years ago

Are you able to replicate? Should I write some program to demonstrate this if not?

NobodyXu commented 2 years ago

Are you able to replicate? Should I write some program to demonstrate this if not?

It would be great if you can write a program to demonstrate this. I haven't tried to replicate yet, but I will give it a shot later today.

NobodyXu commented 2 years ago

I've created a new branch fix/cut-off and I am able to reproduce this:

Testing native-mux implementation
thread 'test_read_large_file_bug' panicked at 'assertion failed: `(left == right)`
  left: `20961280`,
 right: `20971520`', tests/openssh.rs:965:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
NobodyXu commented 2 years ago

TBH, I don't know which line of code is to be blamed here. If something went wrong in openssh-mux-client or openssh, it is likely other integration tests will fail.

Having stdout output cuts off is really strange and something I haven't seen before. I suspect that this could be a bug in ssh, or the ssh multiplex protocol still has some quicks I haven't discovered yet.

NobodyXu commented 2 years ago

I have obtained the log from the ssh multiplex server with the max verbosity (truncated unrelated information):

OpenSSH_8.9p1 Ubuntu-3, OpenSSL 3.0.2 15 Mar 2022
debug3: muxserver_listen: temporary control path /home/nobodyxu/.local/state/.ssh-connection5bcbfD/master.EzMCnqZG4lBmYIXI
debug2: fd 5 setting O_NONBLOCK
debug3: fd 5 is O_NONBLOCK
debug3: fd 5 is O_NONBLOCK
debug1: channel 0: new [/home/nobodyxu/.local/state/.ssh-connection5bcbfD/master]
debug3: muxserver_listen: mux listener channel 0 fd 5
debug2: fd 4 setting TCP_NODELAY
debug3: set_sock_tos: set socket 4 IP_TOS 0x08
debug1: forking to background
debug1: Entering interactive session.
debug1: pledge: id
debug3: receive packet: type 80
debug1: client_input_global_request: rtype hostkeys-00@openssh.com want_reply 0
debug3: receive packet: type 4
debug1: Remote: /config/.ssh/authorized_keys:1: key options: agent-forwarding port-forwarding pty user-rc x11-forwarding
debug3: receive packet: type 4
debug1: Remote: /config/.ssh/authorized_keys:1: key options: agent-forwarding port-forwarding pty user-rc x11-forwarding
debug1: multiplexing control connection
debug2: fd 6 setting O_NONBLOCK
debug3: fd 6 is O_NONBLOCK
debug1: channel 1: new [mux-control]
debug3: channel_post_mux_listener: new mux channel 1 fd 6
debug3: mux_master_read_cb: channel 1: hello sent
debug3: mux_master_read_cb: channel 1 packet type 0x00000001 len 4
debug2: mux_master_process_hello: channel 1 client version 4
debug3: mux_master_read_cb: channel 1 packet type 0x10000002 len 93
debug2: mux_master_process_new_session: channel 1: request tty 0, X 0, agent 0, subsys 0, term "xterm-256color", cmd "sh -c 'dd if=/dev/zero bs=1024 count=20480'", env 0
debug3: mm_receive_fd: recvmsg: Resource temporarily unavailable
debug3: mux_master_process_new_session: got fds stdin 7, stdout 8, stderr 9
debug1: channel 2: new [client-session]
debug2: mux_master_process_new_session: channel_new: 2 linked to control channel 1
debug2: channel 2: send open
debug3: send packet: type 90
debug3: receive packet: type 91
debug2: channel_input_open_confirmation: channel 2: callback start
debug2: client_session2_setup: id 2
debug1: Sending command: sh -c 'dd if=/dev/zero bs=1024 count=20480'
debug2: channel 2: request exec confirm 1
debug3: send packet: type 98
debug3: mux_session_confirm: sending success reply
debug2: channel_input_open_confirmation: channel 2: callback done
debug2: channel 2: open confirm rwindow 0 rmax 32768
debug2: channel 2: rcvd adjust 2097152
debug3: receive packet: type 99
debug2: channel_input_status_confirm: type 99 id 2
debug2: exec request accepted on channel 2
debug2: channel 2: read failed rfd 7 maxlen 32768: Broken pipe
debug2: channel 2: read failed
debug2: chan_shutdown_read: channel 2: (i0 o0 sock -1 wfd 7 efd 9 [write])
debug2: channel 2: input open -> drain
debug2: channel 2: ibuf empty
debug2: channel 2: send eof
debug3: send packet: type 96
debug2: channel 2: input drain -> closed
debug2: channel 2: window 1992704 sent adjust 104448
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: window 1982464 sent adjust 114688
debug2: channel 2: rcvd ext data 99
debug3: receive packet: type 98
debug1: client_input_channel_req: channel 2 rtype exit-status reply 0
debug3: mux_exit_message: channel 2: exit message, exitval 0
debug2: channel 2: written 99 to efd 9
debug2: channel 1: ctl read<=0 rfd 6 len 0
debug2: channel 1: read failed
debug2: chan_shutdown_read: channel 1: (i0 o0 sock 6 wfd 6 efd -1 [closed])
debug2: channel 1: input open -> drain
debug2: channel 1: ibuf empty
debug2: channel 1: input drain -> closed
debug2: channel 1: rcvd close
debug2: channel 1: output open -> drain
debug2: channel 1: obuf empty
debug2: chan_shutdown_write: channel 1: (i3 o1 sock 6 wfd 6 efd -1 [closed])
debug2: channel 1: output drain -> closed
debug2: channel 1: is dead (local)
debug2: channel 1: gc: notify user
debug3: mux_master_control_cleanup_cb: entering for channel 1
debug2: channel 2: write failed
debug2: chan_shutdown_write: channel 2: (i3 o0 sock -1 wfd 8 efd 9 [write])
debug2: channel 2: send eow
debug3: send packet: type 98
debug2: channel 2: output open -> closed
debug2: channel 1: gc: user detached
debug2: channel 1: is dead (local)
debug2: channel 1: garbage collecting
debug1: channel 1: free: mux-control, nchannels 3
debug3: channel 1: status: The following connections are open:
  #1 mux-control (t16 nr0 i3/0 o3/0 e[closed]/0 fd 6/6/-1 sock 6 cc -1 io 0x00/0x01)
  #2 client-session (t4 r0 i3/0 o3/0 e[write]/0 fd -1/-1/9 sock -1 cc -1 io 0x00/0x00)

debug2: channel 2: send close
debug3: send packet: type 97
debug3: receive packet: type 96
debug2: channel 2: rcvd eof
debug3: receive packet: type 97
debug2: channel 2: rcvd close
debug3: channel 2: will not send data after close
debug2: channel 2: is dead
debug2: channel 2: gc: notify user
debug3: mux_master_session_cleanup_cb: entering for channel 2
debug2: channel 2: gc: user detached
debug2: channel 2: is dead
debug2: channel 2: garbage collecting
debug1: channel 2: free: client-session, nchannels 2
debug3: channel 2: status: The following connections are open:
  #2 client-session (t4 r0 i3/0 o3/0 e[write]/0 fd -1/-1/9 sock -1 cc -1 io 0x00/0x00)
NobodyXu commented 2 years ago

What's particularly interesting is the following lines:

debug2: channel 2: write failed
debug2: chan_shutdown_write: channel 2: (i3 o0 sock -1 wfd 8 efd 9 [write])
debug2: channel 2: send eow
debug3: send packet: type 98
debug2: channel 2: output open -> closed

looks like somehow it cannot write to the stdout fd we sent them.

Reading other similar failures from the log, it seems that ssh also "failed to read" from /dev/null fd and ssh does not set O_NONBLOCK on the fd received from mux-control.

So could it be that the pipe is full and returns EAGAIN, and ssh treats that as a failure?

NobodyXu commented 2 years ago

I've disabled non-blocking mode on all fd passed to the ssh multiplex server but it still cuts of the output and has debug2: channel 2: write failed.

NobodyXu commented 2 years ago

The test sometimes passes and sometimes fails. It seems that the problem might be that the channel is closed before all data is consumed

NobodyXu commented 2 years ago

I think this is a problem that requires breaking change in openssh-mux-client to fix due to the API dropping the unix connection too early.

I just realized that we don't need a breaking change afterall.

NobodyXu commented 1 year ago

@Fuuzetsu openssh v0.9.8 has released!