Closed JanneKiiskila closed 4 years ago
Attn: @ARMmbed/team-ublox @MarceloSalazar
Log example below;
Counter 283
[INFO][COAP]: ======== Incoming CoAP package ========
[INFO][COAP]: Status: COAP_STATUS_OK
[INFO][COAP]: Code: COAP_MSG_CODE_EMPTY
[INFO][COAP]: Type: COAP_MSG_TYPE_ACKNOWLEDGEMENT
[INFO][COAP]: Id: 25
[INFO][COAP]: Payload len: 0
[INFO][COAP]: ======== End of CoAP package ========
[INFO][mClt]: M2MNsdlInterface::send_resource_observation - uri 3200/0/5501
[INFO][COAP]: ======== Outgoing CoAP package ========
[INFO][COAP]: Status: COAP_STATUS_OK
[INFO][COAP]: Code: COAP_MSG_CODE_RESPONSE_CONTENT
[INFO][COAP]: Type: COAP_MSG_TYPE_CONFIRMABLE
[INFO][COAP]: Id: 26
[INFO][COAP]: Token: 2d:a1:0f:0b
[INFO][COAP]: Content-type: 0
[INFO][COAP]: Payload len: 3
[INFO][COAP]: Max-age: 0
[INFO][COAP]: Observe: 247
[INFO][COAP]: ======== End of CoAP package ========
Counter 284
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 285
[INFO][COAP]: ======== Outgoing CoAP package ========
[INFO][COAP]: Status: COAP_STATUS_OK
[INFO][COAP]: Code: COAP_MSG_CODE_RESPONSE_CONTENT
[INFO][COAP]: Type: COAP_MSG_TYPE_CONFIRMABLE
[INFO][COAP]: Id: 26
[INFO][COAP]: Token: 2d:a1:0f:0b
[INFO][COAP]: Content-type: 0
[INFO][COAP]: Payload len: 3
[INFO][COAP]: Max-age: 0
[INFO][COAP]: Observe: 247
[INFO][COAP]: ======== End of CoAP package ========
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 286
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 287
[INFO][COAP]: ======== Outgoing CoAP package ========
[INFO][COAP]: Status: COAP_STATUS_OK
[INFO][COAP]: Code: COAP_MSG_CODE_RESPONSE_CONTENT
[INFO][COAP]: Type: COAP_MSG_TYPE_CONFIRMABLE
[INFO][COAP]: Id: 26
[INFO][COAP]: Token: 2d:a1:0f:0b
[INFO][COAP]: Content-type: 0
[INFO][COAP]: Payload len: 3
[INFO][COAP]: Max-age: 0
[INFO][COAP]: Observe: 247
[INFO][COAP]: ======== End of CoAP package ========
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 288
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 289
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 290
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 291
[INFO][COAP]: ======== Outgoing CoAP package ========
[INFO][COAP]: Status: COAP_STATUS_OK
[INFO][COAP]: Code: COAP_MSG_CODE_RESPONSE_CONTENT
[INFO][COAP]: Type: COAP_MSG_TYPE_CONFIRMABLE
[INFO][COAP]: Id: 26
[INFO][COAP]: Token: 2d:a1:0f:0b
[INFO][COAP]: Content-type: 0
[INFO][COAP]: Payload len: 3
[INFO][COAP]: Max-age: 0
[INFO][COAP]: Observe: 247
[INFO][COAP]: ======== End of CoAP package ========
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 292
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 293
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 294
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 295
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 296
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 297
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 298
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 299
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 300
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 301
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 302
[INFO][mClt]: M2MNsdlInterface::received_from_server_callback - message sending failed, id 26
[ERR ][mClt]: M2MInterfaceImpl::registration_error code [6]
[ERR ][mClt]: M2MInterfaceImpl::socket_error: (-8), retry (1), reconnecting (0), reconnection_state (1)
[INFO][mClt]: M2MInterfaceImpl::create_random_initial_reconnection_time() initial random time 34
[INFO][mClt]: M2MInterfaceImpl::socket_error - reconnecting in 34(s)
[ERR ][mClt]: ConnectorClient::error() - error: 6
[INFO][fcc ]: storage_pal_sst.c:602:storage_item_get_data:===> item name = mbed.UseBootstrap len = 17, data max size = 4
[INFO][fcc ]: storage_pal_sst.c:640:storage_item_get_data:<===
[ERR ][mClt]: ServiceClient::connector_error() error 6
[ERR ][mClt]: MbedCloudClient::error code (6)
client_error(6) -> Client in reconnection mode NetworkError
[ERR ][mClt]: ServiceClient::state_failure()
[INFO][mClt]: MbedCloudClient::complete status (-1)
I.e. the notification sending fails - client tries to reconnect, but it just does not help and connection never recovers. Counter was only 283 in this case, so it happened quite quicly that time (I had a total of 4 test runs over the weekend, every one of them failed).
For reference: NXP K66F worked flawlessly the whole weekend, even with the Ethernet cable intentionally plugged out for a while.
Same story with mbed-cloud-client-example, log from that, too.
Counter 283
[INFO][COAP]: ======== Incoming CoAP package ========
[INFO][COAP]: Status: COAP_STATUS_OK
[INFO][COAP]: Code: COAP_MSG_CODE_EMPTY
[INFO][COAP]: Type: COAP_MSG_TYPE_ACKNOWLEDGEMENT
[INFO][COAP]: Id: 25
[INFO][COAP]: Payload len: 0
[INFO][COAP]: ======== End of CoAP package ========
[INFO][mClt]: M2MNsdlInterface::send_resource_observation - uri 3200/0/5501
[INFO][COAP]: ======== Outgoing CoAP package ========
[INFO][COAP]: Status: COAP_STATUS_OK
[INFO][COAP]: Code: COAP_MSG_CODE_RESPONSE_CONTENT
[INFO][COAP]: Type: COAP_MSG_TYPE_CONFIRMABLE
[INFO][COAP]: Id: 26
[INFO][COAP]: Token: 2d:a1:0f:0b
[INFO][COAP]: Content-type: 0
[INFO][COAP]: Payload len: 3
[INFO][COAP]: Max-age: 0
[INFO][COAP]: Observe: 247
[INFO][COAP]: ======== End of CoAP package ========
Counter 284
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 285
[INFO][COAP]: ======== Outgoing CoAP package ========
[INFO][COAP]: Status: COAP_STATUS_OK
[INFO][COAP]: Code: COAP_MSG_CODE_RESPONSE_CONTENT
[INFO][COAP]: Type: COAP_MSG_TYPE_CONFIRMABLE
[INFO][COAP]: Id: 26
[INFO][COAP]: Token: 2d:a1:0f:0b
[INFO][COAP]: Content-type: 0
[INFO][COAP]: Payload len: 3
[INFO][COAP]: Max-age: 0
[INFO][COAP]: Observe: 247
[INFO][COAP]: ======== End of CoAP package ========
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 286
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 287
[INFO][COAP]: ======== Outgoing CoAP package ========
[INFO][COAP]: Status: COAP_STATUS_OK
[INFO][COAP]: Code: COAP_MSG_CODE_RESPONSE_CONTENT
[INFO][COAP]: Type: COAP_MSG_TYPE_CONFIRMABLE
[INFO][COAP]: Id: 26
[INFO][COAP]: Token: 2d:a1:0f:0b
[INFO][COAP]: Content-type: 0
[INFO][COAP]: Payload len: 3
[INFO][COAP]: Max-age: 0
[INFO][COAP]: Observe: 247
[INFO][COAP]: ======== End of CoAP package ========
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 288
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 289
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 290
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 291
[INFO][COAP]: ======== Outgoing CoAP package ========
[INFO][COAP]: Status: COAP_STATUS_OK
[INFO][COAP]: Code: COAP_MSG_CODE_RESPONSE_CONTENT
[INFO][COAP]: Type: COAP_MSG_TYPE_CONFIRMABLE
[INFO][COAP]: Id: 26
[INFO][COAP]: Token: 2d:a1:0f:0b
[INFO][COAP]: Content-type: 0
[INFO][COAP]: Payload len: 3
[INFO][COAP]: Max-age: 0
[INFO][COAP]: Observe: 247
[INFO][COAP]: ======== End of CoAP package ========
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 292
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 293
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 294
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 295
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 296
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 297
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 298
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 299
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 300
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 301
[INFO][mClt]: M2MNsdlInterface::observation_to_be_sent() - send already in progress
Counter 302
[INFO][mClt]: M2MNsdlInterface::received_from_server_callback - message sending failed, id 26
[ERR ][mClt]: M2MInterfaceImpl::registration_error code [6]
[ERR ][mClt]: M2MInterfaceImpl::socket_error: (-8), retry (1), reconnecting (0), reconnection_state (1)
[INFO][mClt]: M2MInterfaceImpl::create_random_initial_reconnection_time() initial random time 34
[INFO][mClt]: M2MInterfaceImpl::socket_error - reconnecting in 34(s)
[ERR ][mClt]: ConnectorClient::error() - error: 6
[INFO][fcc ]: storage_pal_sst.c:602:storage_item_get_data:===> item name = mbed.UseBootstrap len = 17, data max size = 4
[INFO][fcc ]: storage_pal_sst.c:640:storage_item_get_data:<===
[ERR ][mClt]: ServiceClient::connector_error() error 6
[ERR ][mClt]: MbedCloudClient::error code (6)
client_error(6) -> Client in reconnection mode NetworkError
Counter is at that same 283, so that seems like too much of a co-incidence?
Internal Jira reference: https://jira.arm.com/browse/MBOTRIAGE-2559
I have seen this issue too. Thank you @JanneKiiskila goes into the detail.
I actually have seen this issue on mebd-os-5.14.2 and it seems that ARP packet cannot be received after running for a while.
By connecting the ublox to the wifi hotspot on my laptop and use Wireshark for packet capture, I traced things back to the Ublox driver. However, since it's closed source, I cannot proceed anymore. The LWIP trace shows that Ublox requires ARP at 270 seconds (30 seconds before ARP table expires) and it usually gets a response to renew the table. However, after a while, the packet capture shows the gateway(my laptop) provided an ARP response but it wasn't received by the LWIP stack although SSDP packet was received without any issue. Hence, no packet can be sent.
I temporarily goes around the issue by enabling static ARP but would like to see a much more reliable ublox driver.
@ARMmbed/team-ublox - please comment / analyse this issue.
This issue is just seems duplicate of https://github.com/ARMmbed/mbed-os/issues/10815
we would resume analysis on it and would see how we may fix in driver.
for wifi we have analyzed and turning off DHCP and switching to static IP's avoid a crash.
Closing as it's a duplicate of https://github.com/ARMmbed/mbed-os/issues/10815
This is not a crash, this is a jam. So, I don't see the problem symptom being the same necessarily - there is no Mbed OS crash.
Reproducing this issue not require any noisy Wi-Fi environment either, just enough time (2-3 hours).
@JanneKiiskila thanks for clarifying this. I'm reopening for @ARMmbed/team-ublox to investigate if they can.
@JanneKiiskila Are you getting device running into JAM condition without DHCP if practice for 2-3 hours?
I have not tried with static IP addresses, it's not a realistic scenario for many setups.
Description of defect
Try running mbed-os-example-pelion with UBLOX_EVK_ODIN_W2 target, leave it running for a longer time. Once tick count on /3200/0/5501 reaches around 1900-2200 - the connection will fail and all DNS queries against that interface will fail. Seems the Wi-Fi stack gets somehow stuck and it never recovers.
With K66F/Ethernet - you can plug out the whole cable for a long time, then plug it back in and everything recovers nicely.
Target(s) affected by this defect ?
UBLOX_EVK_ODIN_W2
Toolchain(s) (name and version) displaying this defect ?
gcc-arm-none-eabi-9-2019-q4-major
What version of Mbed-os are you using (tag or sha) ?
mbed-os.5.15.1
What version(s) of tools are you using. List all that apply (E.g. mbed-cli)
mbed-cli 1.10.1
How is this defect reproduced ?
mbed_app.json
, optinally turn traces on.-> At some point of time you will have network error/lose Wi-Fi connectivity.