apache / mina-sshd

Apache MINA sshd is a comprehensive Java library for client- and server-side SSH.
https://mina.apache.org/sshd-project/
Apache License 2.0
892 stars 357 forks source link

Client does not work the same when using the async api as with the synchronous api #426

Open alexmherrmann opened 12 months ago

alexmherrmann commented 12 months ago

Version

2.11.0

Bug description

I have not found any examples of what I am trying to do in official documentation.

I have two tests here (in kotlin):

package minatest
import com.google.common.truth.Truth
import org.apache.sshd.client.SshClient
import org.junit.jupiter.api.MethodOrderer.OrderAnnotation
import org.junit.jupiter.api.Order
import org.junit.jupiter.api.Test
import org.junit.jupiter.api.TestMethodOrder
import org.junit.jupiter.api.Timeout
import org.slf4j.LoggerFactory
import org.springframework.beans.factory.annotation.Autowired
import org.springframework.boot.test.context.SpringBootTest
import org.springframework.test.context.ActiveProfiles
import reactor.kotlin.test.test
import java.time.Duration.ofSeconds
import java.util.concurrent.CountDownLatch
import java.util.concurrent.TimeUnit.SECONDS

    val client = SshClient.setUpDefaultClient().apply {
        start()
    }

    @Test
    fun sanityTest() {
        val instance = requestor.getExistingInstance(existingInstance).block(ofSeconds(5))

        val ipv4 = instance.getPublicIpv4Address()

        val connection = client.connect(
            "root",
            ipv4,
            22
        ).verify(ofSeconds(5))

        val output = connection
            .session
            .run {
                auth().verify(ofSeconds(5))
                executeRemoteCommand("uptime")
            }

        logger.info("Output: $output")

    }

    @Test
    fun secondSanityNowAsync() {
        val instance = requestor
            .getExistingInstance(existingInstance)
            .block(ofSeconds(10))
        val connected = client.connect("root", instance.getPublicIpv4Address(), 22)

        val blocker = CountDownLatch(1)
        connected.addListener { conn ->
            logger.info("Connected: $conn")
            val sesh = conn.session
            val auth = sesh.auth().addListener { authed ->
                conn.verify(0)
                authed.verify(0)
                logger.info("Authed: $authed")
                val output = sesh.executeRemoteCommand("uptime")
                logger.info("Output: $output")
                Truth.assertThat(output).contains("load")
                // NOTIFY OF COMPLETION
                blocker.countDown()
            }
        }

        assert(blocker.await(10, SECONDS))

    }

The synchronous one works just fine. I call verify twice and then I am set up to go. However I'm going for an async architecture and would rather do something like the second as it will eventually be integrated into a project reactor application.

The first one runs just fine, the second one hangs forever on the connected.session.executeRemoteCommand(...). Setting a breakpoint it makes it all the way to the second listener (i.e. I get the "Authed: ..." log message) and then never makes it to the "Output: ".

Actual behavior

The test fails to complete in any amount of time.

Expected behavior

The executeRemoteCommand should execute

Relevant log output

From the run where it hangs:

2023-10-24T14:27:51,113 DEBUG [sshd-SshClient[15fdd1f2]-nio2-thread-9] o.a.s.c.s.AbstractClientSession: createExecChannel(ClientSessionImpl[root@/142.93.54.117:22])[uptime] created id=0 - PTY=null
2023-10-24T14:27:51,113 DEBUG [sshd-SshClient[15fdd1f2]-nio2-thread-9] o.a.s.c.c.AbstractClientChannel: open(ChannelExec[id=0, recipient=-1]-ClientSessionImpl[root@/142.93.54.117:22]) Send SSH_MSG_CHANNEL_OPEN - type=session
2023-10-24T14:27:51,113 DEBUG [sshd-SshClient[15fdd1f2]-nio2-thread-9] o.a.s.c.s.h.AbstractSession: encode(ClientSessionImpl[root@/142.93.54.117:22]) packet #7 sending command=90[SSH_MSG_CHANNEL_OPEN] len=24
2023-10-24T14:27:51,113 DEBUG [sshd-SshClient[15fdd1f2]-nio2-thread-9] o.a.s.c.i.n.Nio2Session: writeBuffer(Nio2Session[local=/192.168.86.247:63735, remote=/142.93.54.117:22]) writing 60 bytes

java.lang.AssertionError: Assertion failed (on the countdownlatch block)

The last few logs from when it runs successfully:

2023-10-24T14:15:39,701 DEBUG [main] o.a.s.c.s.AbstractClientSession: createExecChannel(ClientSessionImpl[root@/142.93.54.117:22])[uptime] created id=0 - PTY=null
2023-10-24T14:15:39,702 DEBUG [main] o.a.s.c.c.AbstractClientChannel: open(ChannelExec[id=0, recipient=-1]-ClientSessionImpl[root@/142.93.54.117:22]) Send SSH_MSG_CHANNEL_OPEN - type=session
2023-10-24T14:15:39,702 DEBUG [main] o.a.s.c.s.h.AbstractSession: encode(ClientSessionImpl[root@/142.93.54.117:22]) packet #7 sending command=90[SSH_MSG_CHANNEL_OPEN] len=24
2023-10-24T14:15:39,702 DEBUG [main] o.a.s.c.i.n.Nio2Session: writeBuffer(Nio2Session[local=/192.168.86.247:63598, remote=/142.93.54.117:22]) writing 60 bytes
2023-10-24T14:15:40,374 DEBUG [sshd-SshClient[15fb7a32]-nio2-thread-11] o.a.s.c.s.h.AbstractSession: doHandleMessage(ClientSessionImpl[root@/142.93.54.117:22]) process #8 SSH_MSG_GLOBAL_REQUEST
2023-10-24T14:15:40,375 DEBUG [sshd-SshClient[15fb7a32]-nio2-thread-11] o.a.s.c.s.h.AbstractConnectionService: globalRequest(ClientConnectionService[ClientSessionImpl[root@/142.93.54.117:22]]) received SSH_MSG_GLOBAL_REQUEST hostkeys-00@openssh.com want-reply=false

Other information

No response

tomaswolf commented 8 months ago

The basic problem here is a mixture of asynchronous mechanisms (futures, and listeners) and then a listener making a synchronous executeRemoteCommand call.

The immediate cause for blocking is that internally a lock is held, which prevents any more incoming messages for that session being handled until that listener is done. "Fixing" this in a transparent and scalable way inside the library looks very difficult.

A quick work-around for your case would be to make that listener use a different thread for everything after logger.info("Authed: $authed").