StackStorm / st2

StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html
https://stackstorm.com/
Apache License 2.0

core.remote action throwing an error when ssh connection using private key #5917

Open ciechonp opened 1 year ago

ciechonp commented 1 year ago

SUMMARY

When running the core.remote action to SSH to a host using a private key, an error is thrown.

STACKSTORM VERSION

st2 --version
st2 3.7.0, on Python 3.8.10

OS, environment, install method

K8s helm chart installation

Steps to reproduce the problem

st2 run core.remote cmd=whoami hosts=xxx username=stanley private_key=/home/stanley/.ssh/stanley_rsa

Expected Results

This action should run the whoami command on the remote host and print the output.

An SSH connection using this private_key works properly, without a password, when running the ssh command from the action_runner pod:

ssh -i stanley_rsa stanley@xxx

We have exactly the same private_key on our old StackStorm instance (st2 3.5dev (596c60c23), on Python 3.6.9), and the same core.remote action runs successfully there. I have also verified that the two instances have different versions of the paramiko library.

Actual Results

st2 run core.remote cmd=whoami hosts=xxx username=stanley private_key=/home/stanley/.ssh/stanley_rsa

.

id: 63f779e3308e12af9365df26

action.ref: core.remote

context.user: st2admin

parameters: 

  cmd: whoami

  hosts: xxx

  private_key: '********'

  username: stanley

status: failed

start_timestamp: Thu, 23 Feb 2023 14:36:19 UTC

end_timestamp: Thu, 23 Feb 2023 14:36:21 UTC

result: 

  error: "Unable to connect to any one of the hosts: ['xxx'].

 connect_errors={

  "xxx": {

    "failed": true,

    "succeeded": false,

    "timeout": false,

    "return_code": 255,

    "stdout": "",

    "stderr": "",

    "error": "Failed connecting to host xxx. q must be exactly 160, 224, or 256 bits long",

    "traceback": "Traceback (most recent call last):\

  File \\"/opt/stackstorm/st2/lib/python3.8/site-packages/st2common/runners/parallel_ssh.py\\", line 278, in _connect\

    client.connect()\

  File \\"/opt/stackstorm/st2/lib/python3.8/site-packages/st2common/runners/paramiko_ssh.py\\", line 171, in connect\

    self.client = self._connect(host=self.hostname, socket=self.bastion_socket)\

  File \\"/opt/stackstorm/st2/lib/python3.8/site-packages/st2common/runners/paramiko_ssh.py\\", line 787, in _connect\

    client.connect(**conninfo)\

  File \\"/opt/stackstorm/st2/lib/python3.8/site-packages/paramiko/client.py\\", line 435, in connect\

    self._auth(\

  File \\"/opt/stackstorm/st2/lib/python3.8/site-packages/paramiko/client.py\\", line 682, in _auth\

    self._transport.auth_publickey(username, key)\

  File \\"/opt/stackstorm/st2/lib/python3.8/site-packages/paramiko/transport.py\\", line 1634, in auth_publickey\

    return self.auth_handler.wait_for_response(my_event)\

  File \\"/opt/stackstorm/st2/lib/python3.8/site-packages/paramiko/auth_handler.py\\", line 244, in wait_for_response\

    raise e\

  File \\"/opt/stackstorm/st2/lib/python3.8/site-packages/paramiko/transport.py\\", line 2163, in run\

    handler(self.auth_handler, m)\

  File \\"/opt/stackstorm/st2/lib/python3.8/site-packages/paramiko/auth_handler.py\\", line 375, in _parse_service_accept\

    sig = self.private_key.sign_ssh_data(blob, algorithm)\

  File \\"/opt/stackstorm/st2/lib/python3.8/site-packages/paramiko/dsskey.py\\", line 109, in sign_ssh_data\

    key = dsa.DSAPrivateNumbers(\

  File \\"/opt/stackstorm/st2/lib/python3.8/site-packages/cryptography/hazmat/primitives/asymmetric/dsa.py\\", line 244, in private_key\

    return backend.load_dsa_private_numbers(self)\

  File \\"/opt/stackstorm/st2/lib/python3.8/site-packages/cryptography/hazmat/backends/openssl/backend.py\\", line 826, in load_dsa_private_numbers\

    dsa._check_dsa_private_numbers(numbers)\

  File \\"/opt/stackstorm/st2/lib/python3.8/site-packages/cryptography/hazmat/primitives/asymmetric/dsa.py\\", line 282, in _check_dsa_private_numbers\

    _check_dsa_parameters(parameters)\

  File \\"/opt/stackstorm/st2/lib/python3.8/site-packages/cryptography/hazmat/primitives/asymmetric/dsa.py\\", line 274, in _check_dsa_parameters\

    raise ValueError(\\"q must be exactly 160, 224, or 256 bits long\\")\

ValueError: q must be exactly 160, 224, or 256 bits long\

"

  }

}"

  traceback: "  File "/opt/stackstorm/st2/lib/python3.8/site-packages/st2actions/container/base.py", line 117, in _do_run

    runner.pre_run()

  File "/opt/stackstorm/st2/lib/python3.8/site-packages/st2common/runners/paramiko_ssh_runner.py", line 206, in pre_run

    self._parallel_ssh_client = ParallelSSHClient(**client_kwargs)

  File "/opt/stackstorm/st2/lib/python3.8/site-packages/st2common/runners/parallel_ssh.py", line 90, in __init__

    connect_results = self.connect(raise_on_any_error=raise_on_any_error)

  File "/opt/stackstorm/st2/lib/python3.8/site-packages/st2common/runners/parallel_ssh.py", line 131, in connect

    raise NoHostsConnectedToException(msg)

"

Any suggestions on what needs to be changed to make the core.remote action work correctly?

porlock commented 1 year ago

Same error on my side :/

ciechonp commented 1 year ago

I downgraded st2 from 3.7 to 3.6 and it started working as expected. I also tried st2 3.8 and got the same errors. I think that clearly indicates the issue is related to the paramiko Python library version that was upgraded in st2 3.7. So there is a workaround, but core.remote still does not work on st2 3.7 and 3.8, so we can't use the latest versions.

amanda11 commented 1 year ago

This sounds like this issue: https://github.com/paramiko/paramiko/issues/2048

It is also mentioned here, and the paramiko fix hasn't been merged yet: https://github.com/fabric/fabric/issues/2182

This also explains that the problem is possibly due to paramiko treating the key as the wrong type. So I don't know whether an alternative would be to use a DSA key rather than RSA.

Downgrading paramiko to 2.8.1 in StackStorm should fix it, but I haven't looked at the history and other dependencies to see if that is possible.

amanda11 commented 1 year ago

Looks like we upgraded paramiko just because of dependabot alerts - https://github.com/StackStorm/st2/commit/2dc9d9003d82b63f19ce1b3a577c0b5f99ffa500. I will try to reproduce this and see if downgrading paramiko to 2.8.1 resolves the issue.
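To compare the library versions between a working and a failing instance, something like the following could be run on each one. This is only a sketch: the helper name `installed_version` is made up for illustration, and the interpreter path in the comment is an assumption based on the site-packages path in the traceback above.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: run this with the interpreter of the StackStorm virtualenv,
    # e.g. /opt/stackstorm/st2/bin/python, so that the versions reported are
    # the ones the action runner actually imports.
    for name in ("paramiko", "cryptography"):
        print(name, installed_version(name))
```

Both paramiko and cryptography appear in the traceback, so it seems worth recording both when comparing instances.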

amanda11 commented 1 year ago

Interestingly I couldn't reproduce this problem. Steps taken:

  1. Install single-instance ST2 3.8 on Rocky
  2. Set up passwordless ssh to the remote node using the stanley_rsa key. In my case I had stanley on ST2 logging in as the ubuntu user on another node.
  3. All of the following commands succeeded:
    st2 run core.remote cmd=ls username=ubuntu hosts=xxx private_key=/home/stanley/.ssh/stanley_rsa
    st2 run core.remote cmd=ls username=ubuntu hosts=xxx

This was with the newly generated RSA key put onto the remote server.

There's some more info on the paramiko error which might explain why it is working for me: Typically, when the given private key is an RSA key and publickey authentication passes, the issue doesn't occur. However, when the publickey authentication fails using RSAKey, SSHClient tries to load and authenticate using DSSKey. As a result, the issue occurs.

So I managed to reproduce this error when I put the RSA key into a separate directory but didn't set it up on the remote server, and then I got the error about the DSA key being reported. But when I then put the public key of that key onto the remote server, it all worked fine.

So in my testing, the error "q must be exactly 160, 224, or 256 bits long" only happens when it fails to connect remotely with the private key. When the key validation passes, it all works fine.
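The masking effect described here can be illustrated with a small simulation. This is not paramiko's code: `FakeRSAKey`, `FakeDSSKey`, `AuthenticationFailed` and `try_key_classes` are hypothetical stand-ins that only mimic the "try RSA first, then fall back to DSS" behaviour described above.

```python
class AuthenticationFailed(Exception):
    """Stand-in for 'the server rejected this key'."""


class FakeRSAKey:
    name = "RSAKey"

    def authenticate(self, server_accepts_key: bool) -> None:
        if not server_accepts_key:
            raise AuthenticationFailed("server rejected the RSA public key")


class FakeDSSKey:
    name = "DSSKey"

    def authenticate(self, server_accepts_key: bool) -> None:
        # In this simulation, treating RSA key material as DSA always fails
        # while building the signature, whatever the server would accept.
        raise ValueError("q must be exactly 160, 224, or 256 bits long")


def try_key_classes(server_accepts_key: bool) -> str:
    """Try each key class in order; only AuthenticationFailed is caught."""
    saved = None
    for key_class in (FakeRSAKey, FakeDSSKey):
        try:
            key_class().authenticate(server_accepts_key)
            return f"authenticated with {key_class.name}"
        except AuthenticationFailed as exc:
            saved = exc  # remember the failure and move on to the next class
    raise saved


if __name__ == "__main__":
    print(try_key_classes(server_accepts_key=True))
    try:
        try_key_classes(server_accepts_key=False)
    except ValueError as exc:
        # The real cause (the key was rejected) is hidden by the DSA error.
        print("reported error:", exc)
```

When the server accepts the key, the DSS fallback is never reached. When it rejects the key, the rejection is swallowed and the unrelated DSA `ValueError` is what gets reported.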

Can you double-check the location and permissions of the files, and that you have matching private/public keys? I cannot reproduce the problem on a standalone ST2 running 3.8.0 (which has paramiko 2.10.1).
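A quick way to check the file side of this is to look at the key file's first line and its permission bits. The sketch below uses only the standard library; `describe_private_key` and the `PEM_HEADERS` table are illustrative names, and the path in the usage line is the one from the original report. It does not verify that the private key matches the public key installed on the remote host.

```python
import os
import stat

# First-line markers commonly found in SSH private key files.
PEM_HEADERS = {
    "-----BEGIN RSA PRIVATE KEY-----": "RSA (PEM)",
    "-----BEGIN DSA PRIVATE KEY-----": "DSA (PEM)",
    "-----BEGIN EC PRIVATE KEY-----": "ECDSA (PEM)",
    "-----BEGIN OPENSSH PRIVATE KEY-----": "OpenSSH format (key type is stored inside)",
    "-----BEGIN PRIVATE KEY-----": "PKCS#8",
}


def describe_private_key(path: str) -> dict:
    """Report the container format and permission bits of a private key file."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        first_line = handle.readline().strip()
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return {
        "format": PEM_HEADERS.get(first_line, "unknown"),
        "mode": oct(mode),
        # True if group or others have any access to the file.
        "group_or_other_access": bool(mode & 0o077),
    }


if __name__ == "__main__":
    print(describe_private_key("/home/stanley/.ssh/stanley_rsa"))
```

Running it as the same user, and in the same pod, as the action runner also confirms that the file is readable from where core.remote executes.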