thin-edge / thin-edge.io

The open edge framework for lightweight IoT devices
https://thin-edge.io
Apache License 2.0

Connect Device to AWS returns error and is NOT connected #3067

Open gligorisaev opened 3 weeks ago

gligorisaev commented 3 weeks ago

Describe the bug

Connecting a device to AWS by following the instructions at:

https://thin-edge.github.io/thin-edge.io/start/connect-aws/#user-context

returns the error:

ERROR: Local MQTT publish has timed out.

To Reproduce

  1. Create the certificate: sudo tedge cert create --device-id AWSthinedge
  2. Register the device on AWS IoT Core
    • Create Policy
    • Register a Device (thing)
    • Configure the device: sudo tedge config set aws.url "us-east-1.console.aws.amazon.com"
    • Connect the device: sudo tedge connect aws
      1. The output looks like:

        cmd: ['/bin/sh', '-c', 'sudo tedge connect aws'], exit code: 0
        stdout:
        Checking if systemd is available.
        Checking if configuration for requested bridge already exists.
        Validating the bridge certificates.
        Saving configuration for requested bridge.
        Restarting mosquitto service.
        Awaiting mosquitto to start. This may take up to 5 seconds.
        Enabling mosquitto service on reboots.
        Successfully created bridge connection!
        Sending packets to check connection. This may take up to 2 seconds.
        Connection test failed, attempt 1 of 5
        Sending packets to check connection. This may take up to 2 seconds.
        Connection test failed, attempt 2 of 5
        Sending packets to check connection. This may take up to 2 seconds.
        Connection test failed, attempt 3 of 5
        Sending packets to check connection. This may take up to 2 seconds.
        Connection test failed, attempt 4 of 5
        Sending packets to check connection. This may take up to 2 seconds.
        Warning: Bridge has been configured, but Aws connection check failed.
        Checking if tedge-mapper is installed.
        Starting tedge-mapper-aws service.
        Persisting tedge-mapper-aws on reboot.
        tedge-mapper-aws service successfully started and enabled!

        stderr:
        ERROR: Local MQTT publish has timed out.
        ERROR: Local MQTT publish has timed out.
        ERROR: Local MQTT publish has timed out.
        ERROR: Local MQTT publish has timed out.
        ERROR: Local MQTT publish has timed out.

Expected behavior

Output should look like:

    Checking if systemd is available.
    Checking if configuration for requested bridge already exists.
    Validating the bridge certificates.
    Saving configuration for requested bridge.
    Restarting mosquitto service.
    Awaiting mosquitto to start. This may take up to 5 seconds.
    Enabling mosquitto service on reboots.
    Successfully created bridge connection!
    Sending packets to check connection. This may take up to 2 seconds.
    Received expected response on topic aws/connection-success, connection check is successful. Connection check is successful.
    Checking if tedge-mapper is installed.
    Starting tedge-mapper-aws service.
    Persisting tedge-mapper-aws on reboot.
    tedge-mapper-aws service successfully started and enabled!



Environment

 - OS: Debian 12.6
 - Hardware: aarch64 and Docker
 - System architecture (uname -a): Linux raspberrypi 6.6.31+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.6.31-1+rpt1 (2024-05-29) aarch64 GNU/Linux
 - thin-edge.io version: 1.2.0-16-g85426fca51

gligorisaev commented 3 weeks ago

UPDATE

After a detailed check, the connection is NOT established at this point.

reubenmiller commented 2 weeks ago

@gligorisaev I would check the following things (as I suspect it is just a configuration problem):

  1. Check that you are using the correct AWS URL. It should look something like abcdef12345-ats.iot.us-east-1.amazonaws.com, not the *console* URL that you posted. If the URL is wrong, you will also have to change the AWS policy that you uploaded, as the policy references the same URL (multiple times).
  2. Double-check that you don't have an existing certificate registered with the same certificate Common Name. If the AWS policy was not set up correctly on the first connection, then sometimes the certificate exists but the AWS Thing does not. I've found that you have to delete the existing certificate and retry the connection before you can connect successfully (this advice is based on personal experience, so your mileage may vary).

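
As a rough illustration of the first check, the sketch below tests whether a hostname has the shape of an AWS IoT Core device data endpoint rather than a console URL. The helper name `looks_like_iot_endpoint` and its pattern are assumptions made for illustration and are not part of thin-edge.io; the pattern is deliberately loose and does not cover every AWS partition (for example, endpoints ending in `.amazonaws.com.cn`). The commented commands at the end reuse the `tedge` commands from the report, and the `aws iot describe-endpoint` call assumes the AWS CLI is installed and configured for the right account.

```shell
#!/bin/sh
# Hypothetical helper (not part of thin-edge.io): succeeds when the hostname
# looks like <prefix>[-ats].iot.<region>.amazonaws.com.
looks_like_iot_endpoint() {
    printf '%s\n' "$1" |
        grep -Eq '^[a-z0-9]+(-ats)?\.iot\.[a-z0-9-]+\.amazonaws\.com$'
}

# The two hostnames mentioned in this thread.
for url in "abcdef12345-ats.iot.us-east-1.amazonaws.com" \
           "us-east-1.console.aws.amazon.com"; do
    if looks_like_iot_endpoint "$url"; then
        echo "plausible endpoint: $url"
    else
        echo "not an IoT endpoint: $url"
    fi
done

# Once the real endpoint is known (assumes a configured AWS CLI):
#   aws iot describe-endpoint --endpoint-type iot:Data-ATS
#   sudo tedge config set aws.url "<endpoint from the command above>"
#   sudo tedge connect aws
```

A check like this only catches the wrong kind of hostname; it cannot confirm that the endpoint belongs to the account where the policy and thing were registered.
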
reubenmiller commented 1 day ago

@gligorisaev can we close this issue? I suspect it is just a configuration error. Besides that, we have some test coverage coming in #3097

gligorisaev commented 1 day ago

> @gligorisaev can we close this issue? I suspect it is just a configuration error. Besides that, we have some test coverage coming in #3097

fine with me