espressif / esp-aws-iot

AWS IoT SDK for ESP32 based chipsets
Apache License 2.0

High percentage of failures with xTlsConnect (CA-249) #146

Open mazdacardinal opened 1 year ago

mazdacardinal commented 1 year ago

Hi, I'm seeing a large number of failures on the TLS connection when trying out demos that use this library:

xTlsRet = xTlsConnect( pxNetworkContext );

I'm using the demo from this repo: https://github.com/FreeRTOS/iot-reference-esp32c3 and setting the pcServerRootCAPem, pcClientCertPem and pcClientKeyPem settings in the network context. On roughly 1 in 8 device startups it works perfectly. On every other restart it just loops endlessly, trying to connect every few seconds and returning a status of TLS_TRANSPORT_CONNECT_FAILURE after each attempt. It seems that once it fails a couple of times it never recovers and fails every single time until I restart the board. I wasn't sure if there might be a timeout or some other setting I could tweak. Thanks for any advice.

SolidStateLEDLighting commented 1 year ago

I use the call to xTlsConnect in my IoT project and I have no issues with it. The only thing that strikes me as strange here is whether your SNTP is somehow not getting the job done. Getting a response from a time server takes a few seconds... for me, all the way around the world, sometimes 6 or 7 seconds. You must have the date/time set on the ESP before you attempt to log in to AWS. I suggest you print out the time before the login is attempted, just to be sure that area is OK.
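For what it's worth, here is a minimal sketch of that kind of check, using only standard C `time.h` calls. The function names and the 2020 cutoff are made up for illustration and are not part of esp-aws-iot or the demo; the idea is just that a clock which was never set reports a date near 1970.

```c
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Assumed cutoff: any year before this is treated as "clock never set". */
#define MIN_PLAUSIBLE_YEAR 2020

/* Returns true if `now` looks like real wall-clock time rather than a
 * counter that started near the 1970 epoch at boot. */
bool clockLooksSet(time_t now)
{
    /* gmtime() returns a pointer to shared storage, so read it right away. */
    const struct tm *utc = gmtime(&now);
    if (utc == NULL) {
        return false;
    }
    return (utc->tm_year + 1900) >= MIN_PLAUSIBLE_YEAR;
}

/* Prints the current UTC time and reports whether it looks valid.
 * Intended to be called just before the TLS connection attempt. */
bool logClockBeforeConnect(void)
{
    time_t now = time(NULL);
    const struct tm *utc = gmtime(&now);
    char buf[32] = "unknown";

    if (utc != NULL) {
        strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", utc);
    }

    bool ok = clockLooksSet(now);
    printf("UTC before TLS connect: %s (%s)\n", buf, ok ? "looks set" : "NOT set");
    return ok;
}
```

Calling `logClockBeforeConnect()` right before `xTlsConnect( pxNetworkContext )` would show in the log whether each failed attempt happened with a valid clock.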

mazdacardinal commented 1 year ago

It doesn't seem to matter if the clock is set or not. It can connect either way, but still has a high failure rate in both cases. The time printed during each connection attempt is the current time in UTC. I'm setting the clock after Wi-Fi initialization with:

    sntp_setoperatingmode(SNTP_OPMODE_POLL);
    sntp_setservername(0, "time.google.com");
    sntp_init();

SolidStateLEDLighting commented 1 year ago

The time is important because without it, you can't negotiate with AWS correctly. Your code here may set up the clock, but a callback occurs a few seconds later when the time actually arrives. If you charge off to make an AWS connection without waiting for the SNTP callback, then you have a problem. I would think, however, that the example code does this correctly behind the scenes.

Below is a snippet of what I do. I also have 4 time servers that I rotate through in case one happens to be offline at any given time.

    // We can crash if we try to set an operating mode while the client is
    // running, so stop first to be sure all will be OK.
    sntp_stop();
    sntp_setoperatingmode(SNTP_OPMODE_POLL);
    sntp_setservername(0, "server.name");
    sntp_set_time_sync_notification_cb(timeSyncNotificationCallback);
    sntp_set_sync_mode(SNTP_SYNC_MODE_IMMED);
    sntp_init();
    setenv("TZ", "CST-8", 1);
    tzset();
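The snippet above registers `timeSyncNotificationCallback` but does not show how the connecting task waits for it. A rough sketch of one way to do that follows. `waitForTimeSync`, the flag, and the caller-supplied `delay_ms` hook are assumptions for illustration, not my actual code or anything from the library. On the ESP the delay hook could wrap `vTaskDelay(pdMS_TO_TICKS(ms))`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/time.h> /* struct timeval */

/* Set by the SNTP callback, read by the task that opens the TLS connection. */
static volatile bool s_timeSynced = false;

/* Same shape as the callback registered with
 * sntp_set_time_sync_notification_cb(): void (*)(struct timeval *tv). */
void timeSyncNotificationCallback(struct timeval *tv)
{
    (void)tv; /* The synced time itself is not needed here. */
    s_timeSynced = true;
}

/* Polls the flag until it is set or roughly timeout_ms has elapsed.
 * delay_ms is a caller-supplied sleep function (hypothetical hook).
 * Returns true if the time was synced, false on timeout. */
bool waitForTimeSync(unsigned timeout_ms, unsigned poll_ms,
                     void (*delay_ms)(unsigned ms))
{
    unsigned waited = 0;

    if (poll_ms == 0) {
        poll_ms = 1; /* Avoid a loop that never advances. */
    }

    while (!s_timeSynced) {
        if (waited >= timeout_ms || delay_ms == NULL) {
            return false;
        }
        delay_ms(poll_ms);
        waited += poll_ms;
    }
    return true;
}
```

The connecting task would call something like `waitForTimeSync(15000, 100, myDelay)` after `sntp_init()` and only go on to `xTlsConnect` when it returns true. A FreeRTOS event group or task notification would avoid the polling, but the flag keeps the sketch short.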