labgrid-project / labgrid

Embedded systems control library for development, testing and installation
https://labgrid.readthedocs.io/

SSH master start command returns non-zero value #1087

Open pfiser opened 1 year ago

pfiser commented 1 year ago

Hi,

I have issues with labgrid exporters built with Yocto.

Creating an SSH ControlSocket for those machines returns a non-zero value, which in turn produces the following error:

$ labgrid-client -p regnum-rack video -q low
WARNING: Ticket authentication is deprecated. Please update your coordinator.
Traceback (most recent call last):
  File "/home/prix/labgrid/venv/lib/python3.8/site-packages/labgrid/remote/client.py", line 1785, in main
    args.func(session)
  File "/home/prix/labgrid/venv/lib/python3.8/site-packages/labgrid/remote/client.py", line 1102, in video
    drv.stream(quality, controls=controls)
  File "/home/prix/labgrid/venv/lib/python3.8/site-packages/labgrid/binding.py", line 96, in wrapper
    return func(self, *_args, **_kwargs)
  File "/home/prix/labgrid/venv/lib/python3.8/site-packages/labgrid/driver/usbvideodriver.py", line 133, in stream
    tx_cmd = self.video.command_prefix + ["gst-launch-1.0", "-q"]
  File "/home/prix/labgrid/venv/lib/python3.8/site-packages/labgrid/resource/common.py", line 86, in command_prefix
    conn = sshmanager.get(host)
  File "/home/prix/labgrid/venv/lib/python3.8/site-packages/labgrid/util/ssh.py", line 50, in get
    instance.connect()
  File "/home/prix/labgrid/venv/lib/python3.8/site-packages/labgrid/util/ssh.py", line 399, in connect
    self._open_connection()
  File "/home/prix/labgrid/venv/lib/python3.8/site-packages/labgrid/util/ssh.py", line 174, in _open_connection
    self._start_own_master()
  File "/home/prix/labgrid/venv/lib/python3.8/site-packages/labgrid/util/ssh.py", line 457, in _start_own_master
    raise ExecutionError(
labgrid.driver.exception.ExecutionError: failed to connect to root@192.168.69.51 with args ['ssh', '-x', '-o', 'LogLevel=ERROR', '-o', 'PasswordAuthentication=no', '-n', '-MN', '-o', 'ConnectTimeout=30', '-o', 'ControlPersist=300', '-o', 'ControlMaster=yes', '-o', 'ControlPath=/tmp/lg-con-ud3o_03l/control-root@192.168.69.51', '-o', 'StrictHostKeyChecking=yes', 'root@192.168.69.51'], returncode=2 b'',b''

However, if I change labgrid/util/ssh.py:

diff --git a/labgrid/util/ssh.py b/labgrid/util/ssh.py
index fac4ede4b7b5..bafda59413e5 100644
--- a/labgrid/util/ssh.py
+++ b/labgrid/util/ssh.py
@@ -452,7 +452,7 @@ class SSHConnection:
         )

         try:
-            if self._master.wait(timeout=connect_timeout) != 0:
+            if self._master.wait(timeout=connect_timeout) == 255:
                 stdout, stderr = self._master.communicate()
                 raise ExecutionError(
                     f"failed to connect to {self.host} with args {args}, returncode={self._master.returncode} {stdout},{stderr}"  # pylint: disable=line-too-long

Everything works: the SSH ControlSocket is created and the video starts playing!

According to man ssh:

ssh exits with the exit status of the remote command or with 255 if an error occurred.

hence the proposed change in ssh.py.
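To make the reasoning behind the change explicit, here is a minimal Python sketch of how the man page sentence maps exit statuses to categories. The function name `classify_ssh_exit` and the category labels are hypothetical and not part of labgrid; this only illustrates the idea and does not explain why the Yocto boards return 1 or 2.

```python
def classify_ssh_exit(returncode):
    """Map an ssh exit status to a coarse category, following the man page wording."""
    if returncode == 0:
        return "ok"
    if returncode == 255:
        # Per man ssh: 255 means an error occurred in ssh itself.
        return "ssh-error"
    # Any other value would be the exit status of the remote command.
    return "remote-status"


if __name__ == "__main__":
    for code in (0, 1, 2, 255):
        print(code, classify_ssh_exit(code))
```

With the proposed patch, only the "ssh-error" case would raise ExecutionError.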

I then tried to manually create an SSH ControlSocket independently of labgrid:

$ ssh -x -o LogLevel=ERROR -o PasswordAuthentication=no -n -MN -o ConnectTimeout=30 -o ControlPersist=30 -o ControlMaster=yes -o ControlPath=/tmp/test -o StrictHostKeyChecking=yes root@192.168.69.51
$ echo $?
2
$

And the return value is indeed 2. Then I tried another board built with Yocto and the return value was 1 :)

Both boards have root login enabled without a password! (I also tried setting up a user with a password on those boards, but the return values were always the same, i.e. non-zero.)

Finally, I tried with another machine running Ubuntu 22.04 and the return value was 0.
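One way to tell whether the master actually came up despite the non-zero exit status is to query the control socket directly. Below is a minimal Python sketch, assuming OpenSSH's `-O check` control command; the helper names `control_check_args` and `master_is_alive` are hypothetical and not part of labgrid.

```python
import subprocess


def control_check_args(control_path, host):
    """Build an `ssh -O check` command that queries the master behind control_path."""
    return ["ssh", "-O", "check", "-o", f"ControlPath={control_path}", host]


def master_is_alive(control_path, host, timeout=10):
    """Return True if `ssh -O check` exits with status 0 for this control socket."""
    result = subprocess.run(
        control_check_args(control_path, host),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
        check=False,  # a non-zero status is an expected outcome here, not an exception
    )
    return result.returncode == 0
```

For the manual test above this would be called as `master_is_alive("/tmp/test", "root@192.168.69.51")` within the ControlPersist window.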

So my questions: 1) Did anyone experience the same thing with Yocto-built exporters? 2) Any idea why this happens? 3) Should we check for an ssh return value of 255 instead of any non-zero value, as proposed?

Emantor commented 1 year ago

Which ssh server are you using on Yocto? This may be an issue when dropbear is used instead of ssh-server-openssh. We explicitly use openssh for the LXA TAC yocto layer.

pfiser commented 1 year ago

No, we are using ssh-server-openssh with Yocto (kirkstone).

SSH Version:

$ telnet 192.168.69.51 22
Trying 192.168.69.51...
Connected to 192.168.69.51.
Escape character is '^]'.
SSH-2.0-OpenSSH_8.9

or

root@phyboard-mira-imx6-5:~# sshd -V
unknown option -- V
OpenSSH_8.9p1, OpenSSL 3.0.7 1 Nov 2022

Bastian-Krause commented 1 year ago

Does your manually run ssh command output useful information if you lower the LogLevel?
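As a rough illustration of what that could look like, here is a small Python sketch that rewrites the LogLevel option in an argument list like the one from the traceback, so the command can be rerun with more verbose output. The helper name `with_log_level` is hypothetical; DEBUG is one of the LogLevel values documented in ssh_config.

```python
def with_log_level(args, level="DEBUG"):
    """Return a copy of an ssh argument list with any LogLevel=... option set to `level`."""
    return [
        f"LogLevel={level}" if arg.startswith("LogLevel=") else arg
        for arg in args
    ]


if __name__ == "__main__":
    original = ["ssh", "-x", "-o", "LogLevel=ERROR", "-n", "-MN", "root@192.168.69.51"]
    print(" ".join(with_log_level(original)))
```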