aws-samples / aws-iot-securetunneling-localproxy

AWS Iot Secure Tunneling local proxy reference C++ implementation
https://docs.aws.amazon.com/iot/latest/developerguide/what-is-secure-tunneling.html
Apache License 2.0
74 stars 71 forks source link

Crash when using systemd socket activation? (fatal: _Map_base::at) #116

Open MartinLoeper opened 1 year ago

MartinLoeper commented 1 year ago

Describe the bug

I am building a yocto distribution using AWS Greengrass V2 and AWS Secure-Tunneling by using the latest recipes from meta-aws. My tunnel is configured to connect to an SSH daemon on port 22. The daemon is started using systemd's sshd.socket activation (OpenSSH Per-Connection Daemon).

I am consistently experiencing that the first connection fails with the following log messages:

[458]: [2023-01-23T03:33:05.000684]{472}[debug]   v2 local proxy starts with v1 local proxy format
[458]: [2023-01-23T03:33:05.004847]{472}[debug]   //config does not exist!
[458]: [2023-01-23T03:33:05.011844]{472}[info]    Starting proxy in destination mode
[458]: [2023-01-23T03:33:05.037929]{472}[info]    Attempting to establish web socket connection with endpoint wss://data.tunneling.iot.eu-c>
[458]: [2023-01-23T03:33:05.245399]{472}[debug]   Resolved proxy server IP: 3.72.30.24
[458]: [2023-01-23T03:33:05.297022]{472}[debug]   Connected successfully with proxy server
[458]: [2023-01-23T03:33:05.461362]{472}[debug]   Successfully completed SSL handshake with proxy server
[458]: [2023-01-23T03:33:05.599566]{472}[info]    Web socket session ID: 0a57ecfffe7ba5f8-000014f3-000adba6-cec223a60075677f-b210595d
[458]: [2023-01-23T03:33:05.599716]{472}[debug]   Web socket subprotocol selected: aws.iot.securetunneling-3.0
[458]: [2023-01-23T03:33:05.599765]{472}[info]    Successfully established websocket connection with proxy server: wss://data.tunneling.iot>
[458]: [2023-01-23T03:33:05.599838]{472}[debug]   Seting up web socket pings for every 20000 milliseconds
[458]: [2023-01-23T03:33:05.600185]{472}[debug]   Scheduled next read:
[458]: [2023-01-23T03:33:05.604503]{472}[debug]   Extracting service Ids from control message 5
[458]: [2023-01-23T03:33:05.604645]{472}[info]    Updated port mapping for v1 format:
[458]: [2023-01-23T03:33:05.604692]{472}[info]    SSH-WWAN = localhost:22
[458]: [2023-01-23T03:33:05.604776]{472}[debug]   Starting web socket read loop continue reading...
[458]: [2023-01-23T03:33:05.604879]{472}[debug]   Starting web socket read loop while web socket is already reading. Ignoring...
[458]: [2023-01-23T03:33:05.619408]{472}[fatal]   _Map_base::at

Right afterwards the destination is disconnected. The SSH daemon logs the following:

[   58.014887] run-iot-agent-from-systemd.sh[458]: [2023-01-23T03:33:05.619408]{472}[fatal]   _Map_base::at
[   58.015814] run-iot-agent-from-systemd.sh[458]: Destination tunnel closed.
[   76.232099] run-iot-agent-from-systemd.sh[458]: Opening tunnel...
[   76.684440] systemd[1]: Created slice Slice /system/sshd.
[   76.690603] systemd[1]: Started OpenSSH Per-Connection Daemon ([::1]:56064).
[   76.784562] sshd[478]: ssh_dispatch_run_fatal: Connection from ::1 port 56064: Broken pipe [preauth]
[   76.788446] systemd[1]: sshd@0-::1:22-::1:56064.service: Deactivated successfully.
[   77.505598] systemd[1]: Started OpenSSH Per-Connection Daemon ([::1]:56066).
[   77.568081] run-iot-agent-from-systemd.sh[458]: [2023-01-23T03:33:23.875331]{475}[debug]   v2 local proxy starts with v1 local proxy format

It looks like for the first connection attempt no per-connection daemon process exists and the localproxy crashes whereas for the second attempt the sshd process is started timely and localproxy is able to establish the connection.

Has anybody an idea what might be going on here? Is localproxy supposed to work with systemd socket activation?

Logs

see above

Environment (please complete the following information):

RogerZhongAWS commented 1 year ago

Hello, thanks for posting those logs. I did some testing with socket activation enabled and while I was not able to get the exact same error message, I have experienced similar behavior where it takes a second connection attempt for the ssh session to start working. This is most definitely a bug we want to fix, since there is no reason why we should not be able to support systemd socket activation.