Closed trivialkettle closed 7 months ago
Hi @trivialkettle, I haven't really touched the MQTT-SN code for years and my in-depth understanding of it got a bit rusty over time. Besides, my current bandwidth is a bit limited to look into it in depth.
First of all, if I understood your description correctly, you are seeing an unexpected call to tick(). If so, please make sure that when the callback set by cc_mqttsn_client_set_cancel_next_tick_wait_callback() is called, the previously programmed timer is actually cancelled, and that it is reprogrammed with the new value only after the next invocation of the callback set by cc_mqttsn_client_set_next_tick_program_callback().
To help me analyze the problem, please provide logs with timestamps from your application code:
It will help me see the sequence of the API calls and I might be able to reproduce the problem in the unit-testing and then be able to look into it properly.
Hi @arobenko
You are right, there is an unexpected call to tick().
It looks like the timer is canceled, but the cancel command gets "stuck" in the task queue behind other work, so the timer times out and queues tick() to the task queue.
This needs to be fixed on my side, thanks for your hint.
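For anyone hitting the same race, here is a minimal sketch of one way to guard against it. It is not taken from the library or from my application: TaskQueue and TickTimer are hypothetical names, and the sketch assumes that cancel can update a counter synchronously even when the expiry handler has already been queued. Each program or cancel bumps a generation counter, and a queued expiry only calls tick() if its generation is still current.

```cpp
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical single-threaded task queue, standing in for the application's own.
class TaskQueue {
public:
    void post(std::function<void()> task) { m_tasks.push_back(std::move(task)); }

    // Run every queued task in FIFO order.
    void runAll() {
        while (!m_tasks.empty()) {
            auto task = std::move(m_tasks.front());
            m_tasks.pop_front();
            task();
        }
    }

private:
    std::deque<std::function<void()>> m_tasks;
};

// Hypothetical timer glue. program() and cancel() are what the two
// client callbacks would forward to; tick is whatever drives the client.
class TickTimer {
public:
    TickTimer(TaskQueue& queue, std::function<void()> tick)
      : m_queue(queue), m_tick(std::move(tick)) {}

    void program(unsigned /* ms */) { ++m_generation; m_armed = true; }
    void cancel() { ++m_generation; m_armed = false; }

    // Called when the underlying timer fires; the tick itself is deferred.
    void onExpiry() {
        const std::uint32_t gen = m_generation;
        m_queue.post([this, gen]() {
            // Drop the tick if the timer was cancelled or reprogrammed
            // after this expiry was queued.
            if (!m_armed || gen != m_generation) {
                return;
            }
            m_armed = false;
            m_tick();
        });
    }

private:
    TaskQueue& m_queue;
    std::function<void()> m_tick;
    std::uint32_t m_generation = 0;
    bool m_armed = false;
};
```

With this shape, an expiry that was queued before a cancel (or before a cancel followed by a new program) is dropped when it finally runs, so the client never sees a tick for a wait it already cancelled.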
Hi, I use the client on an MCU and notice strange disconnects from the MQTTSN gateway. The error was always that the gateway disconnected, though I see in the traffic that PINGREQ and PINGRESP are always on time. The disconnects often happened together with a valid ping in the logs:
I used retryPeriod = 5[s] and retryCount = 1.
I set retryCount = 3, m_keepAlivePeriod=10_000 and added some logs to the BasicClient.h and found the following:
The log format is:
The "normal" operation looks like this:
I see a lot of calls to updateTimestamp, and then about 30ms later tick() is called by my timer. tick() calls checkTimeouts(). It does not exit early because m_timestamp is not smaller than m_nextTimeoutTimestamp, and finally calls checkPing(). The condition there evaluates to true because m_lastRecvMsgTimestamp (1322) + m_keepAlivePeriod (10_000) == m_timestamp (11322), and a ping is sent.
As expected this happens 10s later again:
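To make the arithmetic explicit, here is a small sketch of the keep-alive check as I understand it. It is a model, not the code from BasicClient.h: KeepAliveState and pingDue are hypothetical names, and the `<=` comparison is my assumption (the log only shows the case where both sides are equal).

```cpp
#include <cstdint>

// Hypothetical model of the keep-alive state; all values are in milliseconds.
struct KeepAliveState {
    std::uint64_t lastRecvMsgTimestamp;
    std::uint64_t keepAlivePeriod;
};

// Assumption: a ping is due once the full keep-alive period has elapsed
// since the last message was received.
inline bool pingDue(const KeepAliveState& state, std::uint64_t now) {
    return state.lastRecvMsgTimestamp + state.keepAlivePeriod <= now;
}
```

With the numbers from the log, pingDue({1322, 10000}, 11322) is true, while one millisecond earlier it would still be false.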
But sometimes this happens:
updateTimestamp calls checkTimeouts, and m_timestamp == m_nextTimeoutTimestamp, so it does not exit and calls checkPing(). There it again needs to ping, because m_lastRecvMsgTimestamp is 10s old. But then tick is called and m_timestamp is magically increased by 5000ms in just 4ms, so m_timestamp == m_nextTimeoutTimestamp again and the call to sendPing() in the last line of checkPing() is executed. So basically there are two pings in just 4ms, though the retry period is 5s. This led to my gateway disconnect errors.
10s later "normal" operation:
then some time later a double ping again.
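My guess at why the timestamp jumps, as a sketch: I assume that tick() advances the internal clock by the full delay that was last programmed, which would match the 5000ms jump in the log. I have not confirmed this in the library code, and TickClockModel and skewAfterSpuriousTick are hypothetical names made up for this illustration.

```cpp
#include <cstdint>

// Hypothetical model of the client's internal clock; values in milliseconds.
struct TickClockModel {
    std::uint64_t timestamp = 0;
    unsigned programmedDelay = 0;

    // The client asks the application to call tick() after `ms`.
    void program(unsigned ms) { programmedDelay = ms; }

    // Assumption: tick() trusts that the whole programmed delay has passed.
    void tick() {
        timestamp += programmedDelay;
        programmedDelay = 0;
    }
};

// How far the modelled clock runs ahead of real time when tick() arrives
// after only `realElapsedMs` (expected to be <= programmedMs).
inline std::uint64_t skewAfterSpuriousTick(unsigned programmedMs, unsigned realElapsedMs) {
    TickClockModel model;
    model.program(programmedMs);
    model.tick();
    return model.timestamp - realElapsedMs;
}
```

Under that assumption, a tick() that arrives 4ms into a 5000ms wait leaves the client's clock 4996ms ahead of real time, so the retry timeout looks expired and the second ping goes out immediately.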
My workaround is to set retryCount to 3.
Since I am on an MCU with pretty limited debug functionality I cannot really step into the code. I hope the logs help.
Best regards