Azure / iotedge

The IoT Edge OSS project
MIT License

aziot-edged: Only reprovision on Identity Service errors #7246

Closed (gordonwang0 closed 6 months ago)

gordonwang0 commented 6 months ago

During startup, if aziot-edged encounters an error when obtaining the device identity, it attempts to reprovision the device. If aziot-identityd has not fully started yet, aziot-edged can receive an OS-level error such as "Connection refused" and will attempt to reprovision even though nothing is wrong with the device identity.

When the device is online, the extra reprovision is harmless. In offline scenarios, however, the reprovision attempt fails and puts the device in a bad state.

This PR modifies the startup of aziot-edged so that only errors returned by Identity Service cause a reprovision. OS errors such as connection refused, permissions, etc. are ignored because they generally mean that Identity Service has not yet fully started.
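
As a rough illustration of the behavior described above (not the actual diff), the startup decision can be thought of as classifying a failed identity request by where the error came from. The type and function names below (`IdentityRequestError`, `StartupAction`, `startup_action`) are hypothetical and do not come from the aziot-edged source.

```rust
use std::io;

/// Hypothetical error type: records where a failed identity request came from.
#[derive(Debug)]
enum IdentityRequestError {
    /// The request never reached Identity Service
    /// (connection refused, permission denied, and so on).
    Os(io::Error),
    /// Identity Service answered, but with an error of its own.
    Service(String),
}

/// Hypothetical set of outcomes for the startup step.
#[derive(Debug, PartialEq)]
enum StartupAction {
    UseIdentity,
    Reprovision,
    SkipReprovision,
}

/// Decide what to do with the result of the identity request.
/// Only errors returned by Identity Service lead to a reprovision.
fn startup_action(result: &Result<String, IdentityRequestError>) -> StartupAction {
    match result {
        Ok(_) => StartupAction::UseIdentity,
        Err(IdentityRequestError::Service(_)) => StartupAction::Reprovision,
        // OS errors are assumed to mean Identity Service is not up yet.
        Err(IdentityRequestError::Os(_)) => StartupAction::SkipReprovision,
    }
}
```

With this shape, a "Connection refused" at startup no longer reaches the reprovision path, which is the case that matters for offline devices.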

damonbarry commented 6 months ago

/azp run

azure-pipelines[bot] commented 6 months ago

Azure Pipelines successfully started running 4 pipeline(s).

varunpuranik commented 6 months ago

> This PR modifies the startup of aziot-edged so that only errors returned by Identity Service cause a reprovision. OS errors such as connection refused, permissions, etc. are ignored because they generally mean that Identity Service has not yet fully started.

Are we confident that this is always the case? What if there are scenarios where errors, not from Identity Service, also require re-provisioning? Will this change cause a regression in that case?

@veyalla fyi..

gordonwang0 commented 6 months ago

I don't think there are any errors not from Identity Service that should trigger reprovisioning. The two errors that should trigger reprovisioning are DeviceNotFound (unprovisioned) and KeyClient (failure to load device key). Anything else is not Identity Service related, so it shouldn't trigger reprovisioning.
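
A minimal sketch of that rule, assuming a simplified error enum: the variant names `DeviceNotFound` and `KeyClient` are taken from the comment above, while the enum itself (`GetDeviceError`), its payloads, the `Os` variant, and the `should_reprovision` helper are hypothetical and may not match the real types.

```rust
use std::io;

/// Hypothetical, simplified view of the errors seen when fetching the device identity.
#[derive(Debug)]
enum GetDeviceError {
    /// The device is unprovisioned.
    DeviceNotFound,
    /// The device key could not be loaded.
    KeyClient(String),
    /// Anything that did not come from Identity Service, e.g. connection refused.
    Os(io::Error),
}

/// Returns true only for the two errors that should trigger reprovisioning.
fn should_reprovision(err: &GetDeviceError) -> bool {
    matches!(
        err,
        GetDeviceError::DeviceNotFound | GetDeviceError::KeyClient(_)
    )
}
```

Listing the reprovisioning cases explicitly means that any error variant added later defaults to not reprovisioning, which matches the intent stated in the comment.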