StackStorm / st2

StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html
https://stackstorm.com/
Apache License 2.0

core.remote action doesn't always capture stdout and truncates stderr #3975

Open emptywee opened 6 years ago

emptywee commented 6 years ago

After upgrading to 2.6.0, I ran into a weird issue with core.remote actions.

Here are the details of the "faulty" execution (I masked our internal-only domain with example.com):

$ st2 execution get 5a6f9c1ebd00ef78ff5fec97 -d
+-----------------+--------------------------------------------------------------+
| Property        | Value                                                        |
+-----------------+--------------------------------------------------------------+
| id              | 5a6f9c1ebd00ef78ff5fec97                                     |
| action.ref      | core.remote                                                  |
| context.user    | st2admin                                                     |
| parameters      | {                                                            |
|                 |     "cmd": "curl  -svL -w '%{http_code}'                     |
|                 | https://nexus3.example.com; >&2 echo " - something added      |
|                 | manually"",                                                  |
|                 |     "hosts": "127.0.0.1"                                     |
|                 | }                                                            |
| status          | succeeded (2s elapsed)                                       |
| start_timestamp | Mon, 29 Jan 2018 16:11:42 CST                                |
| end_timestamp   | Mon, 29 Jan 2018 16:11:44 CST                                |
| result          | {                                                            |
|                 |     "127.0.0.1": {                                           |
|                 |         "failed": false,                                     |
|                 |         "stderr": "* About to connect() to nexus3.example.com |
|                 | port 443 (#0)                                                |
|                 | *   Trying 10.143.16.34... connected                         |
|                 | * Connected to nexus3.example.com (10.143.16.34) port 443     |
|                 | (#0)                                                         |
|                 | * Initializing NSS with certpath: sql:/etc/pki/nssdb         |
|                 | *   CAfile: /etc/pki/tls/certs/ca-bundle.crt                 |
|                 |   CApath: none                                               |
|                 | * SSL connection using TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA    |
|                 | * Server certificate:                                        |
|                 | * \tsubject: CN=*.example.com,OU=Domain Control Validated     |
|                 | * \tstart date: Nov 10 14:21:01 2017 GMT                     |
|                 | * \texpire date: Nov 10 14:21:01 2020 GMT                    |
|                 | * \tcommon name: *.example.com                                |
|                 | * \tissuer: CN=Go Daddy Secure Certificate Authority -       |
|                 | G2,OU=http://certs.godaddy.com/repository/,O="GoDaddy.com,   |
|                 | Inc.",L=Scottsdale,ST=Arizona,C=US                           |
|                 | > GET / HTTP/1.1                                             |
|                 | > User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu)          |
|                 | libcurl/7.19.7 NSS/3.27.1 zlib/1.2.3 libidn/1.18             |
|                 | libssh2/1.4.2                                                |
|                 | > Host: nexus3.example.com                                    |
|                 | > Accept: */*                                                |
|                 | > ",                                                         |
|                 |         "return_code": 0,                                    |
|                 |         "succeeded": true,                                   |
|                 |         "stdout": ""                                         |
|                 |     }                                                        |
|                 | }                                                            |
| liveaction      | {                                                            |
|                 |     "runner_info": {                                         |
|                 |         "hostname": "lvcops101.example.com",                  |
|                 |         "pid": 30775                                         |
|                 |     },                                                       |
|                 |     "parameters": {                                          |
|                 |         "cmd": "curl  -svL -w '%{http_code}'                 |
|                 | https://nexus3.example.com; >&2 echo " - something added      |
|                 | manually"",                                                  |
|                 |         "hosts": "127.0.0.1"                                 |
|                 |     },                                                       |
|                 |     "action_is_workflow": false,                             |
|                 |     "callback": {},                                          |
|                 |     "action": "core.remote",                                 |
|                 |     "id": "5a6f9c1ebd00ef78ff5fec96"                         |
|                 | }                                                            |
+-----------------+--------------------------------------------------------------+

I expected stdout to contain the HTTP code and stderr to end with " - something added manually", but neither is the case: stdout is empty and stderr is cut off right after the request headers. This can happen 4-5 times in a row and then work as expected 4-5 times, so it is intermittent and the reason is unknown. Could it be something in the paramiko SSH runner? If I remove -v from the curl flags, everything seems fine in 100% of runs. Arekhi on Slack was able to (sort of) reproduce it.

I am filing this issue so someone can take a look at it and possibly fix it. We have some workflows that rely on the HTTP code from curl, and they became unstable because stdout is now empty from time to time. I removed -v to work around it, but I'd like to have some sort of HTTP log recorded for audit and troubleshooting purposes.
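Until the root cause is found, a workflow could at least detect affected runs instead of silently acting on an empty stdout. The following is a minimal sketch, assuming the per-host result shape shown in the execution output above (`stdout`, `stderr`, `return_code`). The function name `find_truncated_hosts` and the idea of checking for a known stderr marker are illustrative assumptions, not part of StackStorm.

```python
from typing import Dict, List


def find_truncated_hosts(result: Dict[str, dict], stderr_marker: str) -> List[str]:
    """Return hosts whose captured output looks truncated.

    `result` is assumed to map each host to a dict with "stdout",
    "stderr" and "return_code" keys, as in the core.remote output above.
    `stderr_marker` is text the command is known to write to stderr last.
    """
    suspect = []
    for host, host_result in result.items():
        # A non-zero return code is a genuine failure, not this symptom.
        if host_result.get("return_code") != 0:
            continue
        stdout = host_result.get("stdout", "")
        stderr = host_result.get("stderr", "")
        stdout_missing = not stdout.strip()
        stderr_cut = not stderr.rstrip().endswith(stderr_marker)
        if stdout_missing or stderr_cut:
            suspect.append(host)
    return sorted(suspect)


# Result shaped like the faulty execution in this report (stderr shortened).
faulty = {
    "127.0.0.1": {
        "failed": False,
        "stderr": "> Host: nexus3.example.com\n> Accept: */*\n> ",
        "return_code": 0,
        "succeeded": True,
        "stdout": "",
    }
}
print(find_truncated_hosts(faulty, "- something added manually"))
```

A workflow step could retry the action when the returned list is non-empty. This only masks the symptom; it does not explain why the output is lost.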

LindsayHill commented 6 years ago

You still seeing this?

emptywee commented 6 years ago

@LindsayHill I need to re-test to confirm whether it's gone in the latest version.

stale[bot] commented 5 years ago

Thanks for contributing to this issue. As it has been 90 days since the last activity, we are automatically marking it as stale. If this issue is not relevant or applicable anymore (problem has been fixed in a new version or similar), please close the issue or let us know so we can close it. On the contrary, if the issue is still relevant, there is nothing you need to do, but if you have any additional details or context which would help us when working on this issue, please include it as a comment to this issue.