openssh-rust / openssh

Scriptable SSH through OpenSSH in Rust
Apache License 2.0

SshSession::connect_mux hangs on MacOS 14.5 #150

Open justin-elementlabs opened 1 month ago

justin-elementlabs commented 1 month ago

When running the example code, Rust prints "SSH before" but never prints "SSH after". After a while, I see a warning that my test "has been running for over 60 seconds". It continues to hang with no error or further output.

Am I missing libraries on my machine? Are there other suggestions on how to find out what is causing this to hang? The russh crate has the same issue.

println!("SSH before");
let result = SshSession::connect_mux(
    format!(
        "ssh://{}@{}:{}",
        &credentials.username, &credentials.host, &credentials.port
    ),
    KnownHosts::Add,
)
.await;
println!("SSH after");

NobodyXu commented 1 month ago

SshSession::connect_mux is usually used when you have already created an SSH multiplex master.

Usually you want to use SessionBuilder.

justin-elementlabs commented 1 month ago

@NobodyXu , thanks for the tip. Maybe you could help with a full example? This is what I have right now.

It's hanging after printing "SFTP 201".

If I run ps -ax | grep ssh I do see

ssh -E .../.local/state/.ssh-connectionbb4jRB/log -S .../.local/state/.ssh-connectionbb4jRB/master -M -f -N -o ControlPersist=yes -o BatchMode=yes -o StrictHostKeyChecking=accept-new -p 22 -l ... -o IdentitiesOnly=yes -I ... ssh://...

 let mut b = SshSessionBuilder::default();
b.keyfile("...");
let dest = format!(
    "ssh://{}@{}:{}",
    &credentials.username, &credentials.host, &credentials.port
);
let (b, d) = b.resolve(&dest);
println!("SFTP 201");

let temp_dir = b.launch_master(&dest).await; // What do I do with temp_dir?
println!("SFTP 202");

let result = SshSession::connect(&dest, KnownHosts::Add).await;
println!("SFTP 203");
if result.is_err() {
    return Err(format!(
        "Unable to connect to SSH server at {}:{} (error: {})",
        &credentials.host,
        &credentials.port,
        result.err().unwrap()
    ));
}
println!("SFTP 204");

let ssh_session = result.unwrap();

let result = _Sftp::from_session(ssh_session, Default::default()).await;
if result.is_err() {
    return Err(format!(
        "Unable to establish to SFTP session from SSH session at {}:{} (error: {})",
        &credentials.host,
        &credentials.port,
        result.err().unwrap()
    ));
}
println!("SFTP 220");

NobodyXu commented 1 month ago

Connecting is quite simple:

let session = SshSessionBuilder::default()
    .keyfile("...")
    .connect_mux(format!(
        "ssh://{}@{}:{}",
        &credentials.username, &credentials.host, &credentials.port
    ))
    .await?;

justin-elementlabs commented 1 month ago

Following the example provided, it still hangs after printing "SFTP 201":

println!("SFTP 200");

let mut b = SshSessionBuilder::default();
let b = b.keyfile("...");
let dest = format!(
    "ssh://{}@{}:{}",
    &credentials.username, &credentials.host, &credentials.port
);

println!("SFTP 201");
let result = b.connect_mux(&dest).await;
println!("SFTP 202");

if result.is_err() {
    return Err(format!(
        "Unable to connect to SSH server at {}:{} (error: {})",
        &credentials.host,
        &credentials.port,
        result.err().unwrap()
    ));
}
println!("SFTP 209");

let ssh_session = result.unwrap();

let result = _Sftp::from_session(ssh_session, Default::default()).await;
if result.is_err() {
    return Err(format!(
        "Unable to establish to SFTP session from SSH session at {}:{} (error: {})",
        &credentials.host,
        &credentials.port,
        result.err().unwrap()
    ));
}
println!("SFTP 220");

justin-elementlabs commented 1 month ago

If I run ps -ax | grep ssh I do see

ssh -E .../.local/state/.ssh-connectionbb4jRB/log -S .../.local/state/.ssh-connectionbb4jRB/master -M -f -N -o ControlPersist=yes -o BatchMode=yes -o StrictHostKeyChecking=accept-new -p 22 -l ... -o IdentitiesOnly=yes -I ... ssh://...

But Rust just hangs

NobodyXu commented 1 month ago

I think the ssh multiplex master itself might be hanging. Maybe the remote host is hanging somehow?

Can you log in to the remote host using ssh from the command line?

justin-elementlabs commented 1 month ago

@NobodyXu , yes I can ssh fine using command line.

NobodyXu commented 1 month ago

So the multiplex master is not working for some reason...

Can you try connecting to the multiplex master directly, using ssh?

justin-elementlabs commented 1 month ago

@NobodyXu , yes this works:

ssh -S /.../.local/state/.ssh-connectionyNAAbu/master ip
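For reference, ssh also has a control command that queries a master through its socket without opening a shell: `ssh -S <socket> -O check <host>` reports whether the master process is running. Below is a minimal sketch of building that invocation from Rust. The function name `mux_check_command` is hypothetical, and the sketch uses only `std::process`, not anything from the openssh crate.

```rust
use std::process::{Command, Stdio};

/// Build `ssh -S <control_path> -O check <host>`.
/// `-S` selects the control socket and `-O check` asks the master
/// behind that socket whether it is still running.
/// (Hypothetical helper for illustration; not part of the openssh crate.)
fn mux_check_command(control_path: &str, host: &str) -> Command {
    let mut cmd = Command::new("ssh");
    cmd.arg("-S")
        .arg(control_path)
        .arg("-O")
        .arg("check")
        .arg(host)
        // No interactive input is needed for a control command.
        .stdin(Stdio::null());
    cmd
}
```

Calling `.status()` on the returned command and checking `success()` would then tell you whether the master answered; the socket path and host to pass in are the ones shown by `ps` for your own session.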

NobodyXu commented 1 month ago

Can you try:

let session = SshSessionBuilder::default()
    .keyfile("...")
    .connect(format!(
        "ssh://{}@{}:{}",
        &credentials.username, &credentials.host, &credentials.port
    ))
    .await?;

Maybe it's the openssh-mux-client not working?

justin-elementlabs commented 1 month ago

@NobodyXu It's the same, unfortunately. It hangs when connect is called and only prints "SFTP 201".

println!("SFTP 201");
let result = SshSessionBuilder::default().keyfile("...").connect(&dest).await;
println!("SFTP 202");

NobodyXu commented 1 month ago

OK, I think I misunderstood the issue.

It actually gets stuck in launch_master; it's likely that the ssh command never exits.

We expect ssh to fork, create a server process in the background, and then return.

That is likely not what happens here.
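The expected fork-and-return behaviour can be illustrated without ssh at all. The sketch below is an analogy, not the crate's code: it runs a shell snippet that backgrounds a long-running child, with all stdio set to null, and waits only for the foreground process. The helper name `run_and_time` is hypothetical, and a Unix `sh` is assumed.

```rust
use std::io;
use std::process::{Command, ExitStatus, Stdio};
use std::time::{Duration, Instant};

/// Run `sh -c <script>` with stdin/stdout/stderr set to null, wait for the
/// foreground shell to exit, and report its status and how long that took.
/// (Hypothetical helper; assumes a Unix-like system with `sh` on PATH.)
fn run_and_time(script: &str) -> io::Result<(ExitStatus, Duration)> {
    let start = Instant::now();
    let status = Command::new("sh")
        .arg("-c")
        .arg(script)
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        // `status()` waits only for the process we spawned (the shell),
        // not for anything that shell left running in the background.
        .status()?;
    Ok((status, start.elapsed()))
}
```

With a script such as `"sleep 3 &"`, the shell exits as soon as it has backgrounded `sleep`, so the call should return long before the three seconds are up. That is the same shape as `ssh -f`: the caller waits for the foreground process, which should exit once the background master is running.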

NobodyXu commented 1 month ago

ssh -E .../.local/state/.ssh-connectionbb4jRB/log -S .../.local/state/.ssh-connectionbb4jRB/master -M -f -N -o ControlPersist=yes -o BatchMode=yes -o StrictHostKeyChecking=accept-new -p 22 -l ... -o IdentitiesOnly=yes -I ... ssh://...

@justin-elementlabs if you execute this command manually, does it exit immediately or get stuck?

justin-elementlabs commented 1 month ago

@NobodyXu it will run for 1-2 seconds and then return to the terminal (exit immediately)

% ssh -E .../.local/state/.ssh-connectionbb4jRB/log -S .../.local/state/.ssh-connectionbb4jRB/master -M -f -N -o ControlPersist=yes -o BatchMode=yes -o StrictHostKeyChecking=accept-new -p 22 -l ... -o IdentitiesOnly=yes -I ... ...

(1-2 seconds later with no more user input...)

%

NobodyXu commented 1 month ago

In openssh, we just wait for the process to exit and check its status: https://docs.rs/openssh/latest/src/openssh/builder.rs.html#487

Given that we are using tokio::process, I think something might be wrong with it? @justin-elementlabs

NobodyXu commented 1 month ago

Which tokio version are you using?

And what is the kernel (Linux or macOS)?

I strongly suspect it's a bug in tokio

justin-elementlabs commented 1 month ago

[[package]]
name = "tokio"
version = "1.38.0"

macOS 14.5

NobodyXu commented 1 month ago

I recommend updating to the latest tokio (1.39.2). If it still doesn't work, try launching that ssh command using tokio::process directly. If it gets stuck under tokio but not on your command line, then it could be a tokio bug.
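A std-only baseline can help with that comparison. The sketch below spawns a command with `std::process`, polls for its exit, and gives up after a timeout instead of hanging forever. The helper names `wait_with_timeout` and `run_with_timeout` are hypothetical, and this deliberately does not use tokio: if the same argument list exits here within the timeout but hangs under tokio::process, that would point toward tokio rather than ssh.

```rust
use std::io;
use std::process::{Child, Command, ExitStatus, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Poll `child` until it exits or `timeout` elapses.
/// Returns `Ok(None)` if the child is still running at the deadline.
fn wait_with_timeout(child: &mut Child, timeout: Duration) -> io::Result<Option<ExitStatus>> {
    let deadline = Instant::now() + timeout;
    loop {
        // `try_wait` never blocks: Some(status) if exited, None otherwise.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(50));
    }
}

/// Spawn `program` with `args` and null stdio, then wait up to `timeout`.
/// A child that is still running at the deadline is killed and reaped.
/// (Hypothetical helper for diagnosis; not part of openssh or tokio.)
fn run_with_timeout(
    program: &str,
    args: &[&str],
    timeout: Duration,
) -> io::Result<Option<ExitStatus>> {
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn()?;
    let result = wait_with_timeout(&mut child, timeout)?;
    if result.is_none() {
        // Timed out: stop the child so it does not linger as a zombie.
        let _ = child.kill();
        let _ = child.wait();
    }
    Ok(result)
}
```

To use it for this issue, pass `"ssh"` as the program and the exact arguments from the `ps` output as `args`, with a timeout of a few seconds. A `Some(status)` result means the foreground ssh exited as expected; `None` means it was still running at the deadline.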

justin-elementlabs commented 1 month ago

@NobodyXu , upgrading tokio didn't help. I created an issue just FYI: https://github.com/tokio-rs/tokio/issues/6770