openssh-rust / openssh

Scriptable SSH through OpenSSH in Rust
Apache License 2.0

SshSession::connect_mux hangs on MacOS 14.5 #150

Open justin-elementlabs opened 3 months ago

justin-elementlabs commented 3 months ago

When I run the example code, the program prints "SSH before" but never prints "SSH after". After a while, I see a warning that my test "has been running for over 60 seconds". It continues to hang with no error or further output.

Am I missing libraries on my machine? Are there other suggestions on how to find out what is causing this to hang? The russh crate has the same issue.

println!("SSH before");
let result = SshSession::connect_mux(
    format!(
        "ssh://{}@{}:{}",
        &credentials.username, &credentials.host, &credentials.port
    ),
    KnownHosts::Add,
)
.await;
println!("SSH after");

NobodyXu commented 3 months ago

SshSession::connect_mux is usually used when you have already created an ssh multiplex master.

Usually you want to use SessionBuilder.

justin-elementlabs commented 3 months ago

@NobodyXu , thanks for the tip. Maybe you could help with a full example? This is what I have right now.

It's hanging after printing "SFTP 201".

If I run ps -ax | grep ssh I do see

ssh -E .../.local/state/.ssh-connectionbb4jRB/log -S .../.local/state/.ssh-connectionbb4jRB/master -M -f -N -o ControlPersist=yes -o BatchMode=yes -o StrictHostKeyChecking=accept-new -p 22 -l ... -o IdentitiesOnly=yes -I ... ssh://...

let mut b = SshSessionBuilder::default();
b.keyfile("...");
let dest = format!(
    "ssh://{}@{}:{}",
    &credentials.username, &credentials.host, &credentials.port
);
let (b, d) = b.resolve(&dest);
println!("SFTP 201");

let temp_dir = b.launch_master(&dest).await; // What do I do with temp_dir?
println!("SFTP 202");

let result = SshSession::connect(&dest,
    KnownHosts::Add,
)
.await;
println!("SFTP 203");
if result.is_err() {
    return Err(format!(
        "Unable to connect to SSH server at {}:{} (error: {})",
        &credentials.host,
        &credentials.port,
        result.err().unwrap()
    ));
}
println!("SFTP 204");

let ssh_session = result.unwrap();

let result = _Sftp::from_session(ssh_session, Default::default()).await;
if result.is_err() {
    return Err(format!(
        "Unable to establish to SFTP session from SSH session at {}:{} (error: {})",
        &credentials.host,
        &credentials.port,
        result.err().unwrap()
    ));
}
println!("SFTP 220");

NobodyXu commented 3 months ago

Connecting is quite simple:

let session = SshSessionBuilder::default()
    .keyfile("...")
    .connect_mux(format!(
        "ssh://{}@{}:{}",
        &credentials.username, &credentials.host, &credentials.port
    ))
    .await?;

justin-elementlabs commented 3 months ago

Following the example provided, it still hangs after printing "SFTP 201":

println!("SFTP 200");

let mut b = SshSessionBuilder::default();
let b = b.keyfile("...");
let dest = format!(
    "ssh://{}@{}:{}",
    &credentials.username, &credentials.host, &credentials.port
);

println!("SFTP 201");
let result = b.connect_mux(&dest).await;
println!("SFTP 202");

if result.is_err() {
    return Err(format!(
        "Unable to connect to SSH server at {}:{} (error: {})",
        &credentials.host,
        &credentials.port,
        result.err().unwrap()
    ));
}
println!("SFTP 209");

let ssh_session = result.unwrap();

let result = _Sftp::from_session(ssh_session, Default::default()).await;
if result.is_err() {
    return Err(format!(
        "Unable to establish to SFTP session from SSH session at {}:{} (error: {})",
        &credentials.host,
        &credentials.port,
        result.err().unwrap()
    ));
}
println!("SFTP 220");

justin-elementlabs commented 3 months ago

If I run ps -ax | grep ssh I do see

ssh -E .../.local/state/.ssh-connectionbb4jRB/log -S .../.local/state/.ssh-connectionbb4jRB/master -M -f -N -o ControlPersist=yes -o BatchMode=yes -o StrictHostKeyChecking=accept-new -p 22 -l ... -o IdentitiesOnly=yes -I ... ssh://...

But the Rust program just hangs.

NobodyXu commented 3 months ago

I think it might be the ssh multiplex master that is hanging. Maybe the remote host is hanging somehow?

Can you log in to the remote host using ssh on the command line?

justin-elementlabs commented 3 months ago

@NobodyXu , yes I can ssh fine using command line.

NobodyXu commented 3 months ago

So the multiplex master is not working for some reason...

Can you try connecting to the multiplex master directly, using ssh?

justin-elementlabs commented 3 months ago

@NobodyXu , yes this works:

ssh -S /.../.local/state/.ssh-connectionyNAAbu/master ip

NobodyXu commented 3 months ago

Can you try:

let session = SshSessionBuilder::default()
    .keyfile("...")
    .connect(format!(
        "ssh://{}@{}:{}",
        &credentials.username, &credentials.host, &credentials.port
    ))
    .await?;

Maybe it's openssh-mux-client that is not working?

justin-elementlabs commented 3 months ago

@NobodyXu It's the same, unfortunately. It hangs when connect is called and only prints "SFTP 201".

println!("SFTP 201");
let result = SshSessionBuilder::default().keyfile("...").connect(&dest).await;
println!("SFTP 202");

NobodyXu commented 3 months ago

OK, I think I misunderstood the issue.

It is actually stuck in launch_master; it's likely that the ssh command never exits.

We expect ssh to fork, create a server process in the background, and then return.

That is likely not happening here.

NobodyXu commented 3 months ago

ssh -E .../.local/state/.ssh-connectionbb4jRB/log -S .../.local/state/.ssh-connectionbb4jRB/master -M -f -N -o ControlPersist=yes -o BatchMode=yes -o StrictHostKeyChecking=accept-new -p 22 -l ... -o IdentitiesOnly=yes -I ... ssh://...

@justin-elementlabs if you execute this command manually, does it exit immediately or get stuck?

justin-elementlabs commented 3 months ago

@NobodyXu it will run for 1-2 seconds and then return to the terminal (exit immediately)

% ssh -E .../.local/state/.ssh-connectionbb4jRB/log -S .../.local/state/.ssh-connectionbb4jRB/master -M -f -N -o ControlPersist=yes -o BatchMode=yes -o StrictHostKeyChecking=accept-new -p 22 -l ... -o IdentitiesOnly=yes -I ... ...

(1-2 seconds later with no more user input...)

%

NobodyXu commented 3 months ago

In openssh, we just wait for the process to exit and check its status: https://docs.rs/openssh/latest/src/openssh/builder.rs.html#487

Given that we are using tokio::process, I think something might be wrong with it? @justin-elementlabs
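
For illustration, the wait-and-check-status pattern can be sketched with the standard library alone. This is not the crate's code (which goes through tokio::process); `run_and_check` is a hypothetical helper, and `sh -c` is a stand-in for the real ssh invocation, whose elided arguments are not reproduced here.

```rust
use std::io;
use std::process::{Command, Stdio};

/// Hypothetical helper (not part of openssh): run a program to completion
/// and report whether it exited with a success status.
fn run_and_check(program: &str, args: &[&str]) -> io::Result<bool> {
    let status = Command::new(program)
        .args(args)
        // No interactive input, similar in spirit to BatchMode=yes.
        .stdin(Stdio::null())
        // Blocks until the child process exits.
        .status()?;
    Ok(status.success())
}

fn main() -> io::Result<()> {
    // `sh -c` stands in for the `ssh -M -f -N ...` command from the thread.
    let ok = run_and_check("sh", &["-c", "exit 0"])?;
    println!("exited successfully: {ok}");
    Ok(())
}
```

If a blocking version like this returns promptly for the real command while the async path does not, that would point at how the async wait is being driven rather than at ssh itself.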

NobodyXu commented 3 months ago

Which tokio version are you using?

And what is the kernel (linux or macOS)?

I strongly suspect it's a bug in tokio

justin-elementlabs commented 3 months ago

[[package]] name = "tokio" version = "1.38.0"

macOS 14.5

NobodyXu commented 3 months ago

I recommend updating to the latest tokio (1.39.2). If it still doesn't work, try launching that ssh command using tokio::process directly. If it gets stuck with tokio but not on your command line, then it could be a tokio bug.

justin-elementlabs commented 3 months ago

@NobodyXu , upgrading tokio didn't help. I created an issue just FYI: https://github.com/tokio-rs/tokio/issues/6770

sander2 commented 4 weeks ago

@justin-elementlabs assuming that this is still an issue, I think it might be caused by your use of executor::block_on: https://github.com/elementlabs42/BitVM-playground/blob/e102f23d88fee5c46f8e6f86442f371e857016ba/src/bridge/client/data_store/sftp.rs#L208 . If you call the function from a regular async function in a tokio context, it works for me.

NobodyXu commented 4 weeks ago

If block_on causes it to fail, then maybe you can try tokio::spawn and then block_on the returned handle?