twilio / breakout-massive-iot-arduino

Issue connecting to AWS IoT's MQTT broker with TLS-only authentication #27

Closed by iannerney 4 years ago

iannerney commented 4 years ago

Hello! I've submitted a support ticket related to this, but I figured it's probably better to come to the source so others running into AWS configuration issues in the future could reference this.

I'm working to connect the Alfa development kit to AWS's IoT service, but I'm running into issues with their TLS-only authentication requirements.

In config.h I've made the following configuration changes: (1) enabling TLS certificates, (2) changing the port to 8883, and (3) commenting out the username and password.

#define USE_TLS_CERTIFICATES
//#define USE_USERNAME_PASSWORD

#define MQTT_BROKER_HOST "{{unique-endpoint-id}}.iot.us-east-1.amazonaws.com"
// MQTT_BROKER_PORT generally is 1883 for clear-text, 8883 for TLS
#define MQTT_BROKER_PORT 8883
#define MQTT_KEEP_ALIVE 0
#define MQTT_CLIENT_ID "prototype-1"
#define MQTT_PUBLISH_TOPIC "prototype-1/info"
#define MQTT_STATE_TOPIC "device/state"

#ifdef USE_USERNAME_PASSWORD
//#define MQTT_LOGIN ""
//#define MQTT_PASSWORD ""
#endif

And in tls_credentials.h I've added my certificate and private key generated in AWS's console, and AWS's RSA 2048 root CA...

#define TLS_DEVICE_CERT "" \
"-----BEGIN CERTIFICATE-----" \
"multiple" \
"line" \
"certificate" \
"goes" \
"here==" \
"-----END CERTIFICATE-----"

#define TLS_DEVICE_PKEY "" \
"-----BEGIN RSA PRIVATE KEY-----" \
"multiple" \
"line" \
"private key" \
"goes" \
"here" \
"-----END RSA PRIVATE KEY-----"

#define TLS_SERVER_CA "" \
"-----BEGIN CERTIFICATE-----" \
"AWS" \
"root CA" \
"certificate" \
"goes" \
"here" \
"-----END CERTIFICATE-----"
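Since formatting of these macros is a concern here, a minimal sketch of how the C preprocessor and compiler treat them may help. EXAMPLE_PEM and both helper functions are hypothetical placeholders, not part of the SDK. The sketch only shows a C language fact: adjacent string literals are joined into a single string with nothing inserted between them. Whether the SDK or the modem expects explicit "\n" separators inside the PEM text is not established in this thread.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical placeholder with the same shape as the macros in tls_credentials.h. */
#define EXAMPLE_PEM "" \
"-----BEGIN CERTIFICATE-----" \
"AAAA" \
"BBBB" \
"-----END CERTIFICATE-----"

/* Returns 1 if the concatenated literal contains a newline character, else 0. */
static int example_pem_has_newline(void) {
    return strchr(EXAMPLE_PEM, '\n') != NULL;
}

/* Returns 1 if the macro expands to exactly the pieces joined back to back. */
static int example_pem_is_single_line(void) {
    const char *joined =
        "-----BEGIN CERTIFICATE-----AAAABBBB-----END CERTIFICATE-----";
    assert(joined != NULL);
    return strcmp(EXAMPLE_PEM, joined) == 0;
}
```

In other words, the compiled string contains a line break only where a literal includes "\n" itself.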

As a result, I'm seeing the following error code in the serial monitor after network registration completes...

00:03:56.347 WARN mqtt.h:109:mqtt_connect() Failed to connect to MQTT broker, error: -6
00:03:56.347 WARN mqtt.h:67:mqtt_loop() Reconnection failed

I've verified that my keys match, and are properly configured to the device in the AWS console. At this point I'm thinking there's either an issue with TLS-only authentication, or I made some novice formatting or configuration error in my simple code changes above. Hoping for some guidance on possible next steps for resolution.

Thanks, in advance, for your time. I really appreciate it!

OYTIS commented 4 years ago

Hi @iannerney,

the first suggestion that comes to my mind is

s/USE_TLS_CERTIFICATES/USE_TLS_CLIENT_CERTIFICATES/

There seems to be an error in the documentation.

If that doesn't help, we can try getting more logs and/or connecting to our MQTT broker.

iannerney commented 4 years ago

Hi @OYTIS, thanks for the quick reply!

It looks like there has been an update to the sample code since SIGNAL, but it is not yet included in the release. I'll update my code from master, make the config changes, and then will report back.

OYTIS commented 4 years ago

Hi @iannerney,

did you already have a chance to try the master branch?

iannerney commented 4 years ago

Hey @OYTIS, thanks for following up on this!

I configured a new sample from master, adding the temp and humidity sensor library, but still no luck with the connection. I'm still seeing the same connection error...

00:00:33.628 INFO OwlModemSocket.cpp:560:sendTCP() Sent data over TCP on socket 0 30 bytes
00:00:43.428 WARN mqtt.h:109:mqtt_connect() Failed to connect to MQTT broker, error: -6
00:00:43.428 WARN mqtt.h:67:mqtt_loop() Reconnection failed

I also checked the AWS logs, and I don't see any authorization errors or connection attempts using that certificate, so I'm still thinking something is off in my config or the tls credentials.

I'm new to MQTT, but based on AWS's documentation on MQTT, it looks like they might have some variations from other MQTT brokers. Do you see anything in here that might prevent the connection from being established?

I'll try to add some debugging steps in the code to get more verbose error messaging and will report back.

OYTIS commented 4 years ago

Hi @iannerney,

could you please post your updated config.h and tls_credentials.h? (feel free to hide the key and/or certificate, I just want to take a look at the general shape).

OYTIS commented 4 years ago

One thing that just occurred to me is the keep alive interval. Depending on your network conditions it might make sense to set MQTT_KEEP_ALIVE to something between 20 and 200.
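For context on what this setting does on the wire, here is a minimal sketch based on the MQTT 3.1.1 specification rather than on the SDK's code. The function names are hypothetical. Keep Alive travels in the CONNECT packet as a 16-bit big-endian number of seconds; a value of 0 switches the mechanism off, and for a non-zero value the broker disconnects a client it has not heard from within one and a half times that period. How AWS IoT itself treats a value of 0 is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode Keep Alive as it appears in the CONNECT variable header:
 * two bytes, most significant byte first, unit is seconds. */
static void encode_keep_alive(uint16_t seconds, uint8_t out[2]) {
    assert(out != NULL);
    out[0] = (uint8_t)(seconds >> 8);   /* MSB */
    out[1] = (uint8_t)(seconds & 0xFF); /* LSB */
}

/* Silence tolerated by the broker per MQTT 3.1.1: 1.5 x Keep Alive.
 * Returns 0 when Keep Alive is 0, meaning no keepalive timeout applies. */
static uint32_t broker_timeout_ms(uint16_t seconds) {
    return (uint32_t)seconds * 1500u;
}
```

With MQTT_KEEP_ALIVE set to 20, for example, the client has to send some packet (a PINGREQ if nothing else) at least every 20 seconds, and the broker allows up to 30 seconds of silence.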

Also please make sure your certificate has the right policy attached. An example of a working policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iot:*",
      "Resource": "*"
    }
  ]
}

If nothing helps, there are two things you can do to help us with debugging. To check if everything is alright with your credentials, you can try to connect with mosquitto_pub:

mosquitto_pub --cafile ./ca.pem --cert ./XXXX-certificate.pem.crt --key ./XXXX-private.pem.key -t device/data -m 'Hello!' -h XXXX.XXXX.amazonaws.com -p 8883

Finally, you can give us more insight into what is going on on your device by replacing owl_log_set_level(L_INFO); in your setup() procedure with owl_log_set_level(L_DBG);, and sharing the log (which is going to be pretty large).

iannerney commented 4 years ago

Here is a ZIP of the config.h and tls_credentials.h as I currently have them configured, with the keys and endpoint URL replaced. files.zip

As for your suggestions, I'll go ahead and run through these and will report back shortly. I can say that I have verified the certificate is attached to a working iot:* policy.

Thank you!

iannerney commented 4 years ago

I've set the MQTT_KEEP_ALIVE value to 110, and verified that the policy matched yours.

I then used mosquitto_pub to send a message using the certificate, and I was able to receive the message in my AWS console.

[Screenshot: Screen Shot 2019-09-26 at 3 12 16 PM]

As for the debug log, here's the output from the start of the process loop, through where the error is displayed. I've included it in a text file, so it's a bit easier to read. debug_log.txt

Please review and let me know if you have any questions. I'm concerned this might just be a formatting error in the tls_credentials.h file, but I reviewed it again and it looks okay. I also validated that the certificates were the same ones I used with mosquitto_pub.

Thank you for your help with this! If you'd prefer to see this in real time, we could set up a quick screenshare session. Just let me know what works best for you.

OYTIS commented 4 years ago

Hi @iannerney,

thank you for the detailed info. I can't see anything wrong in the config or TLS credentials. A live session would be great indeed. The only problem is that I am in the CET time zone, so we'll have to do it in the (U.S.) morning if you don't mind. Please contact me on agerasimov@twilio.com to schedule a session.

iannerney commented 4 years ago

Sure thing, I'll shoot you an email now to coordinate. Thank you!

OYTIS commented 4 years ago

After the debugging session it turned out that AWS needs the keepalive value to fall within quite a narrow interval. It worked fine when set to 20 seconds, which is now the default value in our samples. Further investigation is needed into which values are appropriate and why.
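For anyone landing here with the same error, a minimal sketch of the resulting config.h setting follows. Only the value 20 comes from this thread. The preprocessor guard and the ping_interval_seconds helper are hypothetical additions, not part of the SDK, and sending a ping at half the keepalive period is just one conservative client-side choice.

```c
#include <assert.h>

/* Value reported to work against AWS IoT in this thread (seconds).
 * The original failing configuration used 0. */
#define MQTT_KEEP_ALIVE 20

/* Hypothetical guard: fail the build if keepalive is switched off again. */
#if MQTT_KEEP_ALIVE <= 0
#error "MQTT_KEEP_ALIVE must be positive; 0 turns the MQTT keepalive mechanism off"
#endif

/* Hypothetical helper: ping at half the keepalive period (assumed margin),
 * never more often than once per second. */
static int ping_interval_seconds(int keep_alive_seconds) {
    assert(keep_alive_seconds > 0);
    return keep_alive_seconds > 1 ? keep_alive_seconds / 2 : 1;
}
```

The upper end of the interval that AWS accepts is still open, so the guard deliberately checks only for the one value known to fail in the original report.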