Azure / iotedge

The IoT Edge OSS project
MIT License

Incorrect edgeAgent_total_time_running_correctly_seconds metric #7115

Closed tds-captic closed 1 year ago

tds-captic commented 1 year ago

Expected Behavior

Following this guide led me to notice a potential bug while developing. I still had a few modules that were failing: [Screenshot 2023-09-23 at 23 14 22]

But strangely enough, when looking at the Workbook I saw inconsistent results related to the uptime of these modules: [Screenshot 2023-09-23 at 23 17 49]

At least a few of these should be close to 0.

This is the output of the iotedge list command on the IoT device itself: [Screenshot 2023-09-23 at 23 22 52]

The logs confirm the fact that

Current Behavior

The edgeAgent_total_time_running_correctly_seconds metric is not correct: it does not reflect the amount of time the module was specified in the deployment and was in the running state.
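To make the expectation concrete, here is a minimal sketch of how an uptime ratio could be derived from this metric. It assumes the ratio is computed against edgeAgent_total_time_expected_running_seconds (the companion built-in metric for time specified in the deployment); the function name and the sample numbers are hypothetical and are not taken from this device.

```python
from typing import Optional


def uptime_ratio(running_correctly_s: float, expected_running_s: float) -> Optional[float]:
    """Return running-correctly time as a fraction of expected running time.

    Returns None when the expected time is not positive, because the
    ratio is undefined in that case.
    """
    if expected_running_s <= 0:
        return None
    return running_correctly_s / expected_running_s


# Hypothetical values: a module that was in the deployment for one hour
# but only ran correctly for 30 seconds should show a ratio near 0.
ratio = uptime_ratio(30.0, 3600.0)
print(f"uptime ratio: {ratio:.4f}")
```

A module that keeps failing should land close to 0 with this calculation, which is why a high value in the Workbook looks wrong.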

Device Information

Runtime Versions

yophilav commented 1 year ago

Hi @tds-captic, thank you for the issue report. To investigate this issue, we will need the logs from your system. Could you please contact the Microsoft support team and have them create an incident so that we can exchange the logs securely?

In the meantime, you can check which metrics edgeAgent itself is reporting (without going through the cloud/IoTHub) by doing the following:

  1. Make sure the port 9600 is exposed (see the config in the first paragraph of https://learn.microsoft.com/en-us/azure/iot-edge/how-to-access-built-in-metrics?view=iotedge-1.4 )
  2. Once the port is exposed, you can run sudo docker exec edgeAgent wget -qO- edgeAgent:9600/metrics on the device, which should output all the metrics and their values from edgeAgent.
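
Once the metrics have been scraped as in the steps above, the output can be filtered for the metric in question. The following is a rough sketch, assuming the endpoint returns the Prometheus text exposition format and that each sample carries a module_name label; the SAMPLE text, the module names, and the function name are made up for illustration and are not real output from this device.

```python
import re

# Hypothetical sample of what the metrics endpoint might return.
SAMPLE = """\
# TYPE edgeAgent_total_time_running_correctly_seconds gauge
edgeAgent_total_time_running_correctly_seconds{edge_device="dev1",module_name="edgeHub"} 3600
edgeAgent_total_time_running_correctly_seconds{edge_device="dev1",module_name="failingModule"} 12.5
edgeAgent_used_cpu_percent{edge_device="dev1",module_name="edgeHub"} 1.2
"""

# metric_name{labels} value [timestamp]
LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
# key="value" pairs inside the braces (escaped characters allowed in the value)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def metric_by_module(text: str, metric_name: str) -> dict:
    """Map module_name -> value for every sample of metric_name in text."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None or match.group("name") != metric_name:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        result[labels.get("module_name", "")] = float(match.group("value"))
    return result


values = metric_by_module(SAMPLE, "edgeAgent_total_time_running_correctly_seconds")
for module, seconds in sorted(values.items()):
    print(f"{module}: {seconds} s")
```

Comparing these per-module values with what iotedge list shows on the device would indicate whether the discrepancy already exists at edgeAgent or only appears after the metrics reach the cloud.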

simondegheselle commented 1 year ago

I am facing the same issue 😔

yophilav commented 1 year ago

Could you try exposing port 9600 and scraping the metrics per steps 1 & 2 from the comment above? How do those metrics look?

yophilav commented 1 year ago

@tds-captic Did you get a chance to scrape and look at the edgeAgent metrics?

yophilav commented 1 year ago

Closing the issue due to inactivity.