awslabs / aws-greengrass-labs-containerized-secure-tunneling

MIT No Attribution

Fedora IoT + Podman #3

Closed MarcoRBosco closed 1 year ago

MarcoRBosco commented 1 year ago

We are trying to use the container with the following setup:

After deploying the container with some minor tweaks in recipe.yaml in order to use it with podman:

---
RecipeFormatVersion: '2020-01-25'
ComponentName: aws.greengrass.labs.CustomSecureTunneling
ComponentVersion: '1.0.14'
ComponentDescription: A component for Secure Tunneling in a customized container.
ComponentPublisher: Amazon
ComponentDependencies:
  aws.greengrass.DockerApplicationManager:
    VersionRequirement: ~2.0.0
  aws.greengrass.TokenExchangeService:
    VersionRequirement: ~2.0.0
ComponentConfiguration:
  DefaultConfiguration:
    accessControl:
      aws.greengrass.ipc.mqttproxy:
        aws.greengrass.labs.CustomSecureTunneling:mqttproxy:1:
          policyDescription: Allows access to subscribe to new Secure Tunneling notifications.
          operations:
            - "aws.greengrass#SubscribeToIoTCore"
          resources:
            - "$aws/things/+/tunnels/notify"
Manifests:
  - Platform:
      os: linux
    Lifecycle:
      Install:
        RequiresPrivilege: true
        Script: |-
          podman load -i /var/home/gcc_user/greengrass/v2/packages/artifacts/aws.greengrass.labs.CustomSecureTunneling/1.0.14/image.tar.gz
      Run:
        RequiresPrivilege: true
        Script: |-
          podman run \
                --name=greengrass_custom_secure_tunneling \
                --network=host \
                --restart=always \
                --privileged \
                -e AWS_REGION \
                -e AWS_IOT_THING_NAME \
                -e AWS_GG_NUCLEUS_DOMAIN_SOCKET_FILEPATH_FOR_COMPONENT \
                -e SVCUID \
                -v {kernel:rootPath}/ipc.socket:{kernel:rootPath}/ipc.socket \
                localhost/greengrass_custom_secure_tunneling:1.0.14
    Artifacts:
      - URI: s3://--------------------/aws.greengrass.labs.CustomSecureTunneling/1.0.14/image.tar.gz
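
The Install/Run lifecycle above can be sanity-checked by hand, outside of Greengrass. This is a sketch using the paths and tags from the recipe; the artifact path follows the `{kernel:rootPath}` shown earlier and may differ on your device.

```shell
#!/usr/bin/env bash
# Manual check of the recipe's podman lifecycle (paths/tags from the recipe above).
IMAGE_TAR=/var/home/gcc_user/greengrass/v2/packages/artifacts/aws.greengrass.labs.CustomSecureTunneling/1.0.14/image.tar.gz
IMAGE=localhost/greengrass_custom_secure_tunneling:1.0.14

if command -v podman >/dev/null 2>&1; then
  # Load the archive, then confirm the expected tag exists before letting
  # the Greengrass Run step start the container.
  podman load -i "$IMAGE_TAR"
  if podman image exists "$IMAGE"; then
    echo "image loaded: $IMAGE"
  else
    echo "image tag $IMAGE not found after load -- check the tar contents"
  fi
else
  echo "podman not installed; skipping manual check"
fi
```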

Once it is deployed, we open the tunnel and get the following logs from the container:

New tunnel event received...

Starting aws-iot-device-client...

     /app/aws-iot-device-client --enable-tunneling true --tunneling-region eu-central-1 --tunneling-service SSH --endpoint data.tunneling.iot.eu-central-1.amazonaws.com --tunneling-disable-notification --config-file dummy_config.json --log-level DEBUG

2023-02-16T13:16:33.183Z [WARN]  {FileUtils.cpp}: Permissions to given file/dir path './' is not set to recommended value... {Permissions: {desired: 745, actual: 755}}

2023-02-16T13:16:33.183Z [INFO]  {Config.cpp}: Successfully fetched JSON config file: {

                    "endpoint": "not_needed_see_argv",

                    "cert": "not_needed_see_argv",

                    "key": "not_needed_see_argv",

                    "root-ca": "not_needed_see_argv",

                    "thing-name": "not_needed_see_argv"

                }

2023-02-16T13:16:33.183Z [DEBUG] {Config.cpp}: Did not find a runtime configuration file, assuming Fleet Provisioning has not run for this device

2023-02-16T13:16:33.183Z [DEBUG] {EnvUtils.cpp}: Updated PATH environment variable to: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/root/.aws-iot-device-client:/root/.aws-iot-device-client/jobs:/app:/app/jobs

2023-02-16T13:16:33.183Z [INFO]  {Main.cpp}: Now running AWS IoT Device Client version v1.5.103-b6080e5

Then we create a tunnel and start the local proxy on another machine with `./localproxy -r eu-central-1 -s 5555 -t source-client-access-token`. We can open the source connection but not the destination.
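
For reference, the tunnel can also be created from the CLI instead of the console. This is a hedged sketch: the thing name is a placeholder, while the region and local port match the setup above.

```shell
#!/usr/bin/env bash
# Sketch: open a Secure Tunnel via the AWS CLI. THING_NAME is hypothetical.
THING_NAME=my-greengrass-thing   # placeholder -- replace with your thing name
REGION=eu-central-1
LOCAL_PORT=5555

if command -v aws >/dev/null 2>&1; then
  # OpenTunnel returns a tunnel id plus source/destination access tokens.
  aws iotsecuretunneling open-tunnel \
      --region "$REGION" \
      --destination-config "thingName=$THING_NAME,services=SSH"
  # The sourceAccessToken from the response is then passed to localproxy:
  # ./localproxy -r "$REGION" -s "$LOCAL_PORT" -t "$SOURCE_ACCESS_TOKEN"
else
  echo "aws CLI not installed; skipping"
fi
```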

When we try to start the SSH session with `ssh username@localhost -p 5555`, we get the following output from the proxy, and the SSH session never starts:

creating tcp connection id 1
[2023-02-16T16:22:28.535158]{47629}[info]    Accepted tcp connection on port 5555 from [::1]:34534
[2023-02-16T16:22:28.535181]{47629}[debug]   Sending stream start, setting new stream ID to: 1, service id: SSH
[2023-02-16T16:22:28.535197]{47629}[trace]   Sending messages over web socket for service id: SSH connection id: 1
[2023-02-16T16:22:28.535206]{47629}[trace]   Current queue size: 0
[2023-02-16T16:22:28.535217]{47629}[trace]   Put data 13 bytes into the web_socket_outgoing_message_queue for service id: SSH connection id: 1
[2023-02-16T16:22:28.535232]{47629}[trace]   Getting tcp connection with id: 1
[2023-02-16T16:22:28.535247]{47629}[trace]   num active connections for service id SSH: 1
[2023-02-16T16:22:28.535257]{47629}[trace]   Calling async_write with type: websocket_stream_single_ssl_type
[2023-02-16T16:22:28.535337]{47629}[trace]   Sent 13 bytes over websocket for service id: SSH connection id: 1
[2023-02-16T16:22:28.535351]{47629}[trace]   capturing after_send_message
[2023-02-16T16:22:28.535358]{47629}[trace]   Setting up bi-directional data transfer for service id: SSH connection id: 1
[2023-02-16T16:22:28.535367]{47629}[trace]   Getting tcp connection with id: 1
[2023-02-16T16:22:28.535375]{47629}[trace]   num active connections for service id SSH: 1
[2023-02-16T16:22:28.535389]{47629}[trace]   Clearing tcp connection buffers
[2023-02-16T16:22:28.535401]{47629}[debug]   Starting web socket read loop while web socket is already reading. Ignoring...
[2023-02-16T16:22:28.535409]{47629}[trace]   Begin tcp socket read loop for service id : SSH connection id : 1
[2023-02-16T16:22:28.535417]{47629}[trace]   Getting tcp connection with id: 1
[2023-02-16T16:22:28.535425]{47629}[trace]   num active connections for service id SSH: 1
[2023-02-16T16:22:28.535433]{47629}[trace]   Initiating tcp socket read
[2023-02-16T16:22:28.535452]{47629}[trace]   web_socket_outgoing_message_queue is empty, no more messages to send.
[2023-02-16T16:22:28.535639]{47629}[trace]   Handling read from tcp socket for service id SSH connection id 1
[2023-02-16T16:22:28.535659]{47629}[trace]   Getting tcp connection with id: 1
[2023-02-16T16:22:28.535688]{47629}[trace]   num active connections for service id SSH: 1
[2023-02-16T16:22:28.535707]{47629}[trace]   TCP socket read 21 bytes for service id: SSH, connection id: 1
[2023-02-16T16:22:28.535744]{47629}[trace]   Begin tcp socket read loop for service id : SSH connection id : 1
[2023-02-16T16:22:28.535764]{47629}[trace]   Getting tcp connection with id: 1
[2023-02-16T16:22:28.535780]{47629}[trace]   num active connections for service id SSH: 1
[2023-02-16T16:22:28.535801]{47629}[trace]   Initiating tcp socket read
[2023-02-16T16:22:28.535835]{47629}[trace]   Web socket write buffer drain for service id: SSH, connection id: 1
[2023-02-16T16:22:28.535854]{47629}[trace]   Getting tcp connection with id: 1
[2023-02-16T16:22:28.535870]{47629}[trace]   num active connections for service id SSH: 1
[2023-02-16T16:22:28.535895]{47629}[debug]   Prepare to send data message: service id: SSH stream id: 1 connection id: 1
[2023-02-16T16:22:28.535934]{47629}[trace]   Sending messages over web socket for service id: SSH connection id: 1
[2023-02-16T16:22:28.535953]{47629}[trace]   Current queue size: 0
[2023-02-16T16:22:28.535974]{47629}[trace]   Put data 36 bytes into the web_socket_outgoing_message_queue for service id: SSH connection id: 1
[2023-02-16T16:22:28.536035]{47629}[trace]   Getting tcp connection with id: 1
[2023-02-16T16:22:28.536056]{47629}[trace]   num active connections for service id SSH: 1
[2023-02-16T16:22:28.536075]{47629}[trace]   Calling async_write with type: websocket_stream_single_ssl_type
[2023-02-16T16:22:28.536137]{47629}[debug]   Write buffer has enough space, continue tcp read loop for SSH connection id: 1
[2023-02-16T16:22:28.536162]{47629}[trace]   Begin tcp socket read loop for service id : SSH connection id : 1
[2023-02-16T16:22:28.536182]{47629}[trace]   Getting tcp connection with id: 1
[2023-02-16T16:22:28.536213]{47629}[trace]   num active connections for service id SSH: 1
[2023-02-16T16:22:28.536232]{47629}[debug]   Not starting TCP read loop, socket is already reading
[2023-02-16T16:22:28.536256]{47629}[trace]   Sent 36 bytes over websocket for service id: SSH connection id: 1
[2023-02-16T16:22:28.536285]{47629}[trace]   capturing after_send_message
[2023-02-16T16:22:28.536307]{47629}[trace]   Web socket write buffer drain for service id: SSH, connection id: 1
[2023-02-16T16:22:28.536324]{47629}[trace]   Getting tcp connection with id: 1
[2023-02-16T16:22:28.536341]{47629}[trace]   num active connections for service id SSH: 1
[2023-02-16T16:22:28.536364]{47629}[debug]   not writing, no buffer contents, skip straight to being done draining
[2023-02-16T16:22:28.536390]{47629}[trace]   web_socket_outgoing_message_queue is empty, no more messages to send.
[2023-02-16T16:22:35.063916]{47629}[trace]   Sent ping data: 1676560955063
[2023-02-16T16:22:35.063976]{47629}[trace]   Calling async_ping with type: websocket_stream_single_ssl_type
[2023-02-16T16:22:55.064207]{47629}[trace]   Sent ping data: 1676560975064
[2023-02-16T16:22:55.064263]{47629}[trace]   Calling async_ping with type: websocket_stream_single_ssl_type
[2023-02-16T16:23:15.064437]{47629}[trace]   Sent ping data: 1676560995064
[2023-02-16T16:23:15.064493]{47629}[trace]   Calling async_ping with type: websocket_stream_single_ssl_type
[2023-02-16T16:23:35.064669]{47629}[trace]   Sent ping data: 1676561015064
[2023-02-16T16:23:35.064726]{47629}[trace]   Calling async_ping with type: websocket_stream_single_ssl_type
[2023-02-16T16:23:51.764803]{47629}[trace]   Handling read from tcp socket for service id SSH connection id 1
[2023-02-16T16:23:51.764883]{47629}[trace]   Getting tcp connection with id: 1
[2023-02-16T16:23:51.764904]{47629}[trace]   num active connections for service id SSH: 1
[2023-02-16T16:23:51.764923]{47629}[trace]   received error code: asio.misc:2
[2023-02-16T16:23:51.764941]{47629}[debug]   Handling tcp socket error for service id: SSH connection id: 1. error message: End of file
[2023-02-16T16:23:51.764963]{47629}[trace]   Getting tcp connection with id: 1
[2023-02-16T16:23:51.765007]{47629}[trace]   num active connections for service id SSH: 1
[2023-02-16T16:23:51.765028]{47629}[info]    Disconnected from: [::1]:34534

We cannot see what is blocking the SSH session from starting.
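
One way to narrow this down: an SSH server sends a banner (e.g. `SSH-2.0-...`) immediately after the TCP handshake. The log above shows the TCP connection opening and then ending with "End of file", which is consistent with the destination side never attaching. A quick probe of the forwarded port (assuming localproxy is listening on 5555) can confirm this:

```shell
#!/usr/bin/env bash
# Probe the forwarded port for an SSH banner. If localproxy accepts the TCP
# connection but no banner arrives, the destination side of the tunnel is
# most likely not connected -- matching the EOF disconnect in the log above.
PORT=5555
BANNER=$(timeout 5 bash -c "exec 3<>/dev/tcp/localhost/$PORT; head -c 32 <&3" 2>/dev/null || true)
if [ -z "$BANNER" ]; then
  echo "no SSH banner on port $PORT -> destination side likely not connected"
else
  echo "got banner: $BANNER"
fi
```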

Kriechi commented 1 year ago

Which version or build of localproxy are you using?

Are you creating the new tunnel via CLI -- or can you try the Web Console flow with the integrated SSH terminal in the browser?

Comparing your component output when running the AWS IoT Device Client, I am not certain it fully started, as your output is missing a few log lines compared to the sample output in the README. Are you using Fedora as the host OS, or did you also build the container using Fedora as the base image layer?

MarcoRBosco commented 1 year ago

Which version or build of localproxy are you using? v3.0.2

Are you creating the new tunnel via CLI -- or can you try the Web Console flow with the integrated SSH terminal in the browser?

We also tried the integrated SSH terminal in the browser, but it gets stuck on "connect" without throwing any error (as this is a test, the private key doesn't have a passphrase):

[Screenshot: AWS-SSH]

Are you using Fedora as host OS, or did you also build the container using Fedora as base image layer?

We are using Fedora IoT as the host. We didn't modify the Dockerfile, so we didn't change the base image. We made minor changes in build-custom.sh:

#!/usr/bin/env bash

if [ $# -ne 3 ]; then
  echo 1>&2 "Usage: $0 IMAGE-NAME COMPONENT-NAME COMPONENT-VERSION"
  exit 3
fi

IMAGE_NAME=$1
COMPONENT_NAME=$2
VERSION=$3

cp recipe.yaml ./greengrass-build/recipes

podman build -t "$IMAGE_NAME:$VERSION" src/
podman save --output "./greengrass-build/artifacts/$COMPONENT_NAME/$VERSION/image.tar.gz" "$IMAGE_NAME:$VERSION"

As mentioned, the Dockerfile remains unmodified.
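
For completeness, the script above could be invoked like this; the image name, component name, and version are the ones used elsewhere in this thread, and the check merely confirms the archive lands where the recipe's Artifacts URI expects it.

```shell
#!/usr/bin/env bash
# Example invocation of build-custom.sh with the names used in this thread.
IMAGE_NAME=greengrass_custom_secure_tunneling
COMPONENT_NAME=aws.greengrass.labs.CustomSecureTunneling
VERSION=1.0.14

if [ -x ./build-custom.sh ]; then
  ./build-custom.sh "$IMAGE_NAME" "$COMPONENT_NAME" "$VERSION"
  # The saved archive must match the recipe's Artifacts URI path:
  ls -lh "./greengrass-build/artifacts/$COMPONENT_NAME/$VERSION/image.tar.gz"
else
  echo "build-custom.sh not present in this directory; skipping"
fi
```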

Kriechi commented 1 year ago

Thanks @MarcoRBosco !

I've pushed compatibility improvements to a new branch: https://github.com/awslabs/aws-greengrass-labs-containerized-secure-tunneling/tree/compatibility-improvements Please let me know if this works for you.

I successfully connected using the Web Console SSH feature, as well as the most recent stable release of localproxy.

MarcoRBosco commented 1 year ago

Thank you for the changes @Kriechi !

We tested the new branch and had an issue publishing the component:

[2023-02-19 11:17:45] ERROR - Failed to publish new version of the component 'aws.greengrass.labs.CustomSecureTunneling'

=============================== ERROR ===============================
Could not publish the component due to the following error.
Failed to publish new version of component with the given configuration.
[Errno 2] No such file or directory: '/home/engapplic/aws-greengrass-labs-containerized-secure-tunneling/greengrass-build/recipes/recipe.yaml'

This is due to commit c47858b2fa850531fcb6726e275ec7bcab6fb773, so we kept build-custom.sh at the previous version and the component publish worked.

We tested the new container on our gateway test machine and it worked like a charm with localproxy v3.0.2, but the web console interface still gets stuck on "connect".

We also have two minor problems, but we think they are related to the use of podman instead of docker; we still need to check them.

Kriechi commented 1 year ago

@MarcoRBosco I pushed a rewrite of the last commit to correctly fix the build script. Please let me know if the build process using gdk is working for you now -- you might have to clean up the greengrass-build directory once.
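
The suggested clean rebuild with the Greengrass Development Kit could look like this sketch, assuming the repository's gdk configuration is already set up for the component:

```shell
#!/usr/bin/env bash
# Sketch: clean rebuild and publish with the gdk CLI.
GDK_BUILD_DIR=greengrass-build

if command -v gdk >/dev/null 2>&1; then
  rm -rf "$GDK_BUILD_DIR"   # clear stale build output once, as suggested
  gdk component build
  gdk component publish
else
  echo "gdk not installed; skipping"
fi
```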

I got the web console interface working, though only with password authentication at the moment -- that should not be an issue with the component itself. If it works with the stand-alone localproxy, the component itself is working fine.

Regarding the deployment state: is your deployment healthy or showing an error? Is this a new problem with the branch, or also happening on the main branch?

MarcoRBosco commented 1 year ago

@Kriechi with this new push the build process is working perfectly.

Regarding the deployment state: is your deployment healthy or showing an error? Healthy

Is this a new problem with the branch, or is it also happening on the main branch? It started with the new branch. But to be honest, I have made a lot of deployment revisions (>20) with this component. I need to make one last test to clean the whole setup and restart the testing process. But I understand that the component should be listed in the thing's components in the AWS IoT Console.

The greengrass log doesn't throw any error. What is strange is that the deployment finished well, but the last update of the thing was 4 days ago, when it should be updated with each deployment.

Kriechi commented 1 year ago

@MarcoRBosco you should see the component status in your AWS Console when viewing the GGv2 core device. The aws.greengrass.labs.CustomSecureTunneling component should be in the Running state, as it is a long-running process that should not exit or stop:

[Screenshot 2023-02-22 at 13 37 40]

State is reported periodically or during events; see https://docs.aws.amazon.com/greengrass/v2/developerguide/device-status.html
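
The same status can be queried from the CLI instead of the console. This is a sketch; the core device name is a placeholder for your Greengrass core's thing name.

```shell
#!/usr/bin/env bash
# Sketch: query Greengrass v2 core device health and component states.
CORE_DEVICE=MyGreengrassCore   # placeholder -- replace with your core device name

if command -v aws >/dev/null 2>&1; then
  # Overall reported health of the core device (HEALTHY / UNHEALTHY):
  aws greengrassv2 get-core-device --core-device-thing-name "$CORE_DEVICE"
  # Per-component state; the tunneling component should report RUNNING:
  aws greengrassv2 list-installed-components --core-device-thing-name "$CORE_DEVICE"
else
  echo "aws CLI not installed; skipping status check"
fi
```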

If you could please validate and confirm that the changes on the branch solve the issues you reported, I will merge it to the main branch and resolve this issue. Thanks!

MarcoRBosco commented 1 year ago

Hello @Kriechi,

Sorry, due to scheduling problems I couldn't test whether the component appears in the AWS Console.

We hope we can re-engage with the tests as soon as possible.

I’ll keep you updated if the issue with the AWS console is resolved.