aws-samples / aws-iot-securetunneling-localproxy

AWS IoT Secure Tunneling local proxy reference C++ implementation
https://docs.aws.amazon.com/iot/latest/developerguide/what-is-secure-tunneling.html
Apache License 2.0

localproxy not working properly with aws-iot-client in v1.0.19 of aws.greengrass.SecureTunneling component #154

Open dbouras opened 3 weeks ago

dbouras commented 3 weeks ago

Describe the bug

After upgrading aws.greengrass.SecureTunneling from v1.0.18 to v1.0.19, localproxy on MacOS or Linux still connects fine (nothing out of the ordinary in the localproxy logs), but any SSH connection through the tunnel hangs. When this happens, the log at the other end (file aws.greengrass.SecureTunneling.log) shows the following error every time an SSH connection is initiated through the localproxy tunnel:

2024-06-21T05:19:56.685Z [INFO] (Copier) aws.greengrass.SecureTunneling: stdout. [ERROR] 2024-06-21 05:19:56.685 [pool-3-thread-3] SubscribeResponseHandler - Secure Tunneling Process: 2024-06-21T05:19:56.684Z [ERROR] {SecureTunnelingContext.cpp}: SecureTunnelingContext::OnSendDataComplete errorCode=13339. {scriptName=services.aws.greengrass.SecureTunneling.lifecycle.run.script, serviceName=aws.greengrass.SecureTunneling, currentState=RUNNING}

I hope someone understands what error code 13339 means.

Interestingly enough, opening an SSH terminal connection from the AWS Web console still works just fine -- the SSH session starts correctly and the above message does not appear.  

To Reproduce

Steps to reproduce the behavior:

  1. deploy aws.greengrass.SecureTunneling v1.0.19
  2. establish a tunnel using localproxy
  3. use ssh to open a terminal connection to the core device
  4. ssh hangs

Expected behavior

ssh successfully establishes a terminal connection

Actual behavior

ssh hangs

Logs

If applicable, add full logs of errors and outputs to help explain your problem. Preferably, also increase the verbosity; for example, to enable debug logs for localproxy, use the CLI option -v 6.

Environment (please complete the following information):

Additional context

SSH version on Linux:

% ssh -V
OpenSSH_8.4p1, OpenSSL 1.1.1l-fips  24 Aug 2021 SUSE release 150500.17.31.1

SSH version on MacOS:

% ssh -V
OpenSSH_9.7p1, OpenSSL 3.3.0 9 Apr 2024

Rolling back to aws.greengrass.SecureTunneling v1.0.18 and continuing to use localproxy from commit d3150e0 rectifies the issue.

RogerZhongAWS commented 3 weeks ago

Hi @dbouras, with the v1.0.19 component, can you use localproxy commit 9eace7470fbbee00473074f6dc763afdc9e11a4c and also pass the --destination-client-type V1 arg to your localproxy run command? This is a new flag introduced in the latest version to address this issue.
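
For illustration, an invocation along those lines might look like the sketch below. Only `--destination-client-type V1` comes from this thread. The `-r` (region) and `-s` (source listen port) flags and the `AWSIOT_TUNNEL_ACCESS_TOKEN` environment variable are assumptions to verify against `localproxy --help`, and `start_v1_tunnel` and `LOCALPROXY_BIN` are hypothetical names.

```shell
# Hypothetical wrapper: start_v1_tunnel and LOCALPROXY_BIN are illustrative names.
# Assumes the source access token is already exported as AWSIOT_TUNNEL_ACCESS_TOKEN
# and that -r / -s are the region and source-port flags (check `localproxy --help`).
start_v1_tunnel() {
  region="$1"   # AWS region of the tunnel, e.g. eu-west-1
  port="$2"     # local port the proxy listens on, e.g. 8940
  "${LOCALPROXY_BIN:-./localproxy}" \
    -r "$region" \
    -s "$port" \
    --destination-client-type V1
}
```

A call such as `start_v1_tunnel eu-west-1 8940` would start the proxy in source mode, after which `ssh -p 8940 <user>@localhost` can be run from a second terminal.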

dbouras commented 3 weeks ago

Hi @RogerZhongAWS,

Thank you for the note -- yes, now it works (component version 1.0.19 and localproxy commit 9eace74), but I see a steady stream of messages in the localproxy log (about once every 15 sec, after the first few that are printed when the SSH connection is being established) about reverting to v2 message format -- see below:

[2024-06-21 12:45:37.509917] (0x00000001ee5d8c00) [info] setting source protocol to V1
[2024-06-21 12:45:37.510396] (0x00000001ee5d8c00) [info] Starting proxy in source mode
[2024-06-21 12:45:37.516365] (0x00000001ee5d8c00) [info] Attempting to establish web socket connection with endpoint wss://data.tunneling.iot.eu-west-1.amazonaws.com:443
[2024-06-21 12:45:38.263588] (0x00000001ee5d8c00) [info] Web socket session ID: 0aeddbfffe19ef17-00006a5a-000018c2-9b361bbb73253fa6-ea21f48e
[2024-06-21 12:45:38.263872] (0x00000001ee5d8c00) [info] Successfully established websocket connection with proxy server: wss://data.tunneling.iot.eu-west-1.amazonaws.com:443
[2024-06-21 12:45:38.264313] (0x00000001ee5d8c00) [info] Updated port mapping for v1 format:
[2024-06-21 12:45:38.264393] (0x00000001ee5d8c00) [info] SSH = 8940
[2024-06-21 12:45:38.264471] (0x00000001ee5d8c00) [info] calling setup from loop
[2024-06-21 12:45:38.265490] (0x00000001ee5d8c00) [info] Listening for new connection on port 8940
[2024-06-21 12:45:46.055110] (0x00000001ee5d8c00) [info] Falling back to older protocol, setting new connection id to 0
[2024-06-21 12:45:46.055182] (0x00000001ee5d8c00) [info] creating tcp connection id 0
[2024-06-21 12:45:46.055212] (0x00000001ee5d8c00) [info] Accepted tcp connection on port 8940 from [::1]:56727
[2024-06-21 12:45:47.301184] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:47.301495] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:47.502952] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:47.503201] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:47.886962] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:47.887179] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:51.534885] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:51.535208] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:53.816705] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:53.816844] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:54.161717] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:54.161971] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:57.169536] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:57.169796] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:57.329951] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:57.330253] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:57.499694] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:57.499970] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:57.669897] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:57.670122] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:57.835985] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:57.836247] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:58.036730] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:58.037024] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:58.037230] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:58.037300] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:58.037413] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:45:58.037472] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:46:12.965849] (0x00000001ee5d8c00) [info] reverting to v2 message format
[2024-06-21 12:46:12.966119] (0x00000001ee5d8c00) [info] reverting to v2 message format

Any way to stop it from doing this? (apart from patching the code, I mean)

RogerZhongAWS commented 3 weeks ago

Not really. It looks like this log should not be at the info level, as it drowns out other, more important lines. It will need a code patch to fix.
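
Until such a patch lands, one workaround outside the code is to filter the message out of the proxy's output. This is a minimal sketch that assumes the message text matches the log above exactly; `drop_v2_noise` is a hypothetical helper name.

```shell
# Hypothetical helper: drops the repeated v2-format message from a log stream.
drop_v2_noise() {
  # -F treats the pattern as a fixed string; -v keeps only non-matching lines.
  grep -v -F 'reverting to v2 message format'
}
```

It could be used as `./localproxy <your usual args> 2>&1 | drop_v2_noise`. For a live terminal, adding `--line-buffered` (available in GNU and BSD grep) to the `grep` call avoids delayed output.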

dbouras commented 2 weeks ago

Since you'll be patching this issue, maybe also look at the patch that I've been using for MacOS since last November, contributed via my pull request #145 -- it hasn't been merged so far and there is no news on when that may be. It's currently stuck on "Review Required" because I have also changed CMakeLists.txt to statically link OpenSSL.

RogerZhongAWS commented 2 weeks ago

Approved and merged your PR, but I am curious where you see the added value in statically linking OpenSSL.

dbouras commented 2 weeks ago

Thank you @RogerZhongAWS. The reason I opted to statically link OpenSSL is that our team uses localproxy on MacOS (mostly Sonoma 14.x) and Linux (Ubuntu LTS and latest non-LTS versions, as well as OpenSUSE Leap 15.x), along with an in-house utility script for setting up and tearing down tunnels to core devices in the field. I wanted to build two binaries for the two architectures (arm64 and x86_64 respectively) that everyone could use without having to worry about shared object library compatibility. I hope it makes sense now :)