torizon / meta-toradex-torizon

Torizon OS OpenEmbedded Distro Layer
MIT License
6 stars 13 forks source link

Error in initial call to auto-provisioning causes a 5 min delay before it takes effect #90

Closed EdTheBearded closed 3 months ago

EdTheBearded commented 3 months ago

Running the auto-provisioning.sh script results in an initial error. This triggers the 5 min “RestartSec” from the systemd service script causing a long delay before the devices are available in the platform. It would be nice if this didn’t happen and devices were more quickly adopted into the platform.

With additional logging added we can see what the issue is on initial attempt:

Apr 28 17:42:38 verdin-imx8mp-06849059 auto-provisioning.sh[817]: Starting auto-provisioning script
Apr 28 17:42:38 verdin-imx8mp-06849059 auto-provisioning.sh[817]: Checking provisioning status
Apr 28 17:42:38 verdin-imx8mp-06849059 auto-provisioning.sh[832]:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Apr 28 17:42:38 verdin-imx8mp-06849059 auto-provisioning.sh[832]:                                  Dload  Upload   Total   Spent    Left  Speed
Apr 28 17:42:39 verdin-imx8mp-06849059 auto-provisioning.sh[832]: [158B blob data]
Apr 28 17:42:39 verdin-imx8mp-06849059 auto-provisioning.sh[832]: curl: (60) SSL certificate problem: certificate is not yet valid
Apr 28 17:42:39 verdin-imx8mp-06849059 auto-provisioning.sh[832]: More details here: https://curl.se/docs/sslcerts.html
Apr 28 17:42:39 verdin-imx8mp-06849059 auto-provisioning.sh[832]: curl failed to verify the legitimacy of the server and therefore could not
Apr 28 17:42:39 verdin-imx8mp-06849059 auto-provisioning.sh[832]: establish a secure connection to it. To learn more about this situation and
Apr 28 17:42:39 verdin-imx8mp-06849059 auto-provisioning.sh[832]: how to fix it, please visit the web page mentioned above.

Seems like the curl SSL certificates are not valid yet when we try to get the access token from the server. Since trying again shortly after boot works, it seems to be some kind of boot initialization/timing issue where we try to use the curl before the certificates are “ready”. If the issue is timing based it may not occur on every type of SOM we have.

This has been easily reproduced on Verdin iMX8M Mini (Drew) and Plus (Jeremias and our customer Fline). See Fline report on https://toradex.slack.com/archives/C06MZRUG70F/p1715337847826789. Here’s also a background on why the timeout is 5 minutes https://toradex.atlassian.net/browse/TOR-2271. Also, we suggest customer edit the timeout https://developer.toradex.com/torizon/torizoncore/production-programming-in-torizon/#check-and-customize-auto-provisioning-process.