Azure / azure-iot-middleware-freertos

Azure IoT Middleware for FreeRTOS
https://azure.github.io/azure-iot-middleware-freertos/
MIT License
79 stars 25 forks source link

_ProcessLoop takes 2000ms when ulTimeoutMilliseconds < 2000 and nothing to process #294

Closed rtheil-growlink closed 1 year ago

rtheil-growlink commented 1 year ago

I'm noticing a strange pattern where when there's nothing for _ProcessLoop to process, it takes exactly 2000ms to run.

Quick overview of how I have things set up. Main loop task does the work to queue messages, then there's an azure task that runs constantly and does nothing but maintain Azure connectivity and constantly run _ProcessLoop when connected.

Here's the code in the azure task: image image

PROCESS_LOOP_TIMEOUT_MS is 500, but even when set to 0 the behavior is the same. Here's the log of a typical run when there's nothing to do. Note the difference is EXACTLY 2000ms, and usually is. image

When attempting to send a message in the main loop, I run _ProcessLoop a single time after _SendTelemetry to attempt to deliver the message right away. This allows me to get a failure if we suddenly become disconnected, and I can then store the message for sending later, or if the device happens to reboot or crash. image

This used to be set to PROCESS_LOOP_TIMEOUT_MS. Setting the timeout to 0 from 500 made a big difference in the amount of time spent on this task. In this case, it takes just 360ms, and is often lower. image

Here, I make a direct method call to the device (no work here, just returning a variable value). Because there's something to do, it takes just 1080ms, even with calling _SendCommandResponse. image

The reason this even matters is because during this 2 second period, any direct method call will not be processed and must wait until the next or subsequent iteration. A response can take upwards of 5 seconds or more. Any idea why when there's nothing for _ProcessLoop to do, it takes exactly 2 full seconds to finish, but when there is something to do it finishes quickly?

danewalton commented 1 year ago

Hi @rtheil-growlink I'm thinking it could be this from the coreMQTT docs:

https://github.com/FreeRTOS/coreMQTT/blob/19d198c8cb03c00307608b3008b62d433b6f4353/source/include/core_mqtt.h#L445-L461

Specifically, I think number 1 might be happening. We have a transport send/receive delay of 5000ms currently in our ADU sample:

https://github.com/Azure-Samples/iot-middleware-freertos-samples/blob/1f3be178a6bce3dd072a8e85580b11ab73765ae3/demos/sample_azure_iot_adu/sample_azure_iot_adu.c#L113

Can you try modifying that to see if a lower number gets you better performance?

danewalton commented 1 year ago

Hey @rtheil-growlink Any updates on this?

rtheil-growlink commented 1 year ago

Thanks for those updates, Dane. I'm going to look deeper into this when I have a chance. For now it's not a huge issue, but something I'd like to improve in future firmware versions. I'll close this for now and reopen in the future if needed.