Open pekkanikander opened 5 years ago
Hi @pekkanikander,
Have you tried to reproduce this issue without any other libraries in the sketch? Ideally if we had something that just used the MKRNB
lib it would be easier to isolate.
Would you also be able to share a debug log?
Sorry, I've pretty much run out of time with this. (We had https://iothon.io, where we used MKR NB 1500 boards. But that is over now.)
What we managed to do is a working SWD interface. However, I didn't have time to get gdb working on my Mac OS X. If you can point out instructions to that, I can try to take a few stack traces still.
Below some debug logs with our MKRNB fork.
16:33:27.408 -> '
16:33:27.442 -> MODEM: poll: resp='+USORD: 0,""\013
16:33:27.442 -> \013
16:33:27.442 -> OK\013
16:33:27.442 -> '
16:33:27.442 -> MODEM: send: data='AT+USORD=0,512'
16:33:27.442 -> MODEM: poll: at='AT+USORD=0,512\013\013
16:33:27.442 -> '
16:33:27.442 -> MODEM: poll: resp='+USORD: 0,""\013
16:33:27.442 -> \013
16:33:27.442 -> OK\013
16:33:27.442 -> '
16:33:28.462 -> MODEM: send: data='AT+USORD=0,512'
16:33:28.462 -> MODEM: poll: at='AT+USORD=0,512\013\013
16:33:28.462 -> '
16:33:28.462 -> MODEM: poll: resp='+CME ERROR: Operation not allowed\013
16:33:28.462 -> '
16:33:28.462 -> NBClient: Closing the socket (may take several minutes)...
16:33:28.462 -> MODEM: poll: urc='\013
16:33:28.462 -> '
16:33:28.497 -> MODEM: send: data='AT+USOCL=0'
16:33:28.497 -> MODEM: poll: at='AT+USOCL=0\013\013
16:33:28.497 -> '
16:33:28.497 -> MODEM: poll: resp='+CME ERROR: Operation not allowed\013
16:33:28.497 -> '
16:33:28.497 -> NBClient: Closing the socket: closed.
---
17:13:36.239 -> NBClient: Closing the socket (may take several minutes)...
17:13:36.276 -> MODEM: send: data='AT+USOCL=0'
17:15:31.648 -> MODEM: poll: at='AT+USOCL=0\013\013
17:15:31.648 -> '
17:15:31.648 -> MODEM: poll: resp='OK\013
17:15:31.648 -> '
17:15:31.648 -> NBClient: Closing the socket: closed.
17:15:31.648 -> Connecting to the MQTT server...\013
17:15:31.648 -> MODEM: send: data='AT+USOCR=6'
---
MODEM: send: data='AT+USORD=0,512'
MODEM: poll: at='AT+USORD=0,512
'
MODEM: poll: resp='+CME ERROR: Operation not allowed
'
NBClient: Closing the socket (may take several minutes)...
MODEM: poll: urc='
'
MODEM: send: data='AT+USOCL=0'
MODEM: poll: at='AT+USOCL=0
'
MODEM: poll: resp='+CME ERROR: Operation not allowed
'
NBClient: Closing the socket: closed.
The middle trace is a rare one. There the stack managed to continue opening a new socket before crashing. In all other cases observed the Arduino crashed shortly after the socket got closed.
Hi @pekkanikander,
I managed to find a MQTT broker I could start/stop/kill and tried your example sketch with MKRNB v1.3.0 (released today) and PubSubClient v2.7.0. All looks good after I stop mosquito or kill -9 the mosquito process, I've attach the output from the serial monitor for both scenarios.
We have some networking problems with out test 5G network, basically causing the network to close TCP connections more or less randomly. This causes the MKR NB 1500 to completely crash, including the USB host connection, until reset. As we don't have any SWD debugging capability (as of yet), it is very hard to pinpoint the real cause.
Setup
Libraries:
PubSubClient
,MKRNB
Sketch: https://github.com/iothon/docker-compose-mqtt-influxdb-grafana/tree/master/03-arduino_mqtt/mkr_nb_1500_mqttTo repeat
Have a NB-IoT SIM card, modify and flash the sketch (from above), wait until the sketch connects to your MQTT server.
Cause a network side TCP connection termination, e.g. pull the Ethernet from your server,
kill
your Mosquitto, or something similar. Sometimes a regular TCP close also seems to cause the same problem.Symptoms
In
NBClient.cpp
, if theAT+USORD=0,512
call, used to receive incoming packets, returns+CEM ERROR: ...
, this causes the NBClient to close the socket, due toPubSubClient
calling_client->stop()
.Some time after the
NBClient::stop()
method has returned, the sketch crashes so badly that the virtual TTY is no longer visible through USB. The board seems to stay in that stage until it is reset with the reset button, or if it is disconnected, it runs out of battery.