zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.6k stars 6.49k forks source link

Error on lwm2m update after hibernate when using HL7800 modem #63634

Closed jbr-ia closed 9 months ago

jbr-ia commented 12 months ago

Describe the bug I try to use an HL7800 modem on a pinnacle-board (pinnacle-100-dvk) with lwm2m, NBiot and low power options enabled. If the modem goes to Hibernate and it wakes up, than the lwm2m-connection can not be updated (caused by a timeout to connect to an UDP-socket) and a new full lwm2m-registration is carried out. This was seen with firmware 4.6.9.4 of the modem. I also tried with a newer firmware-version (4.7.1.1). With that version, even the initial lwm2m-registration fails.

When looking further at this, I saw that after the modem went to hibernate, and wakes up again, the driver calls reconfigure_IP_connection https://github.com/zephyrproject-rtos/zephyr/blob/96175fcc476dc0a19a870d7da9e893f9f548e06f/drivers/modem/hl7800.c#L5749 In that function an AT+KUDPCFG? is send. In the send_at_cmd-code there is no sock yet, so for the timeout, response_sem is used. https://github.com/zephyrproject-rtos/zephyr/blob/96175fcc476dc0a19a870d7da9e893f9f548e06f/drivers/modem/hl7800.c#L990

Then, the modem sends a response with+KUDPCFG containing a socket that already exists on the modem and the function on_cmd_sockcreate is called. https://github.com/zephyrproject-rtos/zephyr/blob/96175fcc476dc0a19a870d7da9e893f9f548e06f/drivers/modem/hl7800.c#L3904 In that function last_socket_id is changed.

When zephyr than receives the 'OK' from the modem, the driver has a sock, and sock_send_sem is given. https://github.com/zephyrproject-rtos/zephyr/blob/96175fcc476dc0a19a870d7da9e893f9f548e06f/drivers/modem/hl7800.c#L3683 Because send_at_cmd-code waits for the response_sem, but the semaphore sock_send_sem is given by the ok, send_at_cmd errors with a timeout (after some time), and the lwm2m-engine gives the error ‘Cannot connect to UDP’.

It seems that with the newer modem firmware, (4.7.1.1), also after a reset the modem remembers the udp-socket, and also the first AT+KUDPCFG? in booting already returns an socket, leading to problems with taking a different semaphore than is given later.

To Reproduce To reproduce, I have modified the LwM2M-sample to use the pinnacle-modem, with lowpoweroptions enabled and wait in the main-program for the modem to be connected to the network before attempting to start the lwm2m-registration. Also the prj.conf is changed, so the lowpoweroptions are enabled, and some extra logging is added

  1. The changed sample is at: https://github.com/zephyrproject-rtos/zephyr/compare/main...jbr-ia:zephyr:lwm2m_pinnacle_test_modem
  2. Build the code west build -b pinnacle_100_dvk ./samples/net/lwm2m_client/ -- -DOVERLAY_CONFIG="overlay-queue.conf;overlay-lwm2m-1.1.conf"
  3. Flash it to the pinnacle west flash
  4. Run it on the pinnacle while connected to the serial interface.
  5. Wait until the modem is connected to the network, the lwm2m-registration was successfull, and some time after that the modem is hibernated.
  6. When the modem is hibernated, enter lwm2m update. Not the lwm2m-stack fails to update, and does a full re-registration

Expected behavior I expect that after the modem had hibernated, that it is possible to update the lwm2m-registration without a full re-registration.

Impact With the 4.6.9.4 firmware of the modem a full registration when an update was only needed. With 4.7.1.1 of the modem-firmware even not able to do an lwm2m registration.

Logs and console output

uart:~$ lwm2m update
[00:14:32.302,459] <dbg> net_lwm2m_rd_client: lwm2m_rd_client_service: State: 10
[00:14:32.302,490] <dbg> net_lwm2m_client_app: rd_client_event: Registration update
[00:14:32.302,520] <dbg> net_lwm2m_rd_client: lwm2m_rd_client_service: State: 11
[00:14:32.302,551] <dbg> net_lwm2m_engine: lwm2m_engine_connection_resume: Close and resume a new connection
[00:14:32.302,612] <dbg> modem_hl7800: allow_sleep: Keep awake
[00:14:32.302,612] <dbg> modem_hl7800: wakeup_hl7800: Waiting to wakeup
[00:14:32.303,375] <dbg> modem_hl7800: allow_sleep: Keep awake
[00:14:34.084,136] <dbg> modem_hl7800: hl7800_rx: HANDLE +KUDP_IND:  (len:3)
[00:14:34.084,197] <dbg> modem_hl7800: on_cmd_sock_ind: +KUDP_IND ID: 1
[00:14:34.085,174] <dbg> modem_hl7800: hl7800_rx: HANDLE +KSUP:  (len:1)
[00:14:34.085,235] <inf> modem_hl7800: Startup State: READY
[00:14:34.085,266] <inf> modem_hl7800: Sleep State: AWAKE
[00:14:34.085,388] <dbg> modem_hl7800: send_at_cmd: OUT: [AT+KCNXCFG=1,"GPRS","",,,"IPV4"]
[00:14:34.088,226] <dbg> modem_hl7800: send_at_cmd: sem_take, response_sem
[00:14:34.099,884] <dbg> modem_hl7800: hl7800_rx: HANDLE OK (len:0)
[00:14:34.099,914] <dbg> modem_hl7800: on_cmd_sockok: sem_give, response_sem
[00:14:34.100,006] <dbg> modem_hl7800: send_at_cmd: OUT: [AT+KTCPCFG?]
[00:14:34.101,013] <dbg> modem_hl7800: send_at_cmd: sem_take, response_sem
[00:14:34.105,895] <dbg> modem_hl7800: hl7800_rx: HANDLE OK (len:0)
[00:14:34.105,926] <dbg> modem_hl7800: on_cmd_sockok: sem_give, response_sem
[00:14:34.106,018] <dbg> modem_hl7800: send_at_cmd: OUT: [AT+KUDPCFG?]
[00:14:34.107,055] <dbg> modem_hl7800: send_at_cmd: sem_take, response_sem
[00:14:34.115,600] <dbg> modem_hl7800: hl7800_rx: HANDLE +KUDPCFG:  (len:23)
[00:14:34.115,631] <dbg> modem_hl7800: on_cmd_sockcreate: +KUDPCFG: 1
[00:14:34.115,661] <dbg> modem_hl7800: on_cmd_sockcreate: last socket id. Before: 0, after: 1
[00:14:34.116,119] <dbg> modem_hl7800: hl7800_rx: HANDLE OK (len:0)
[00:14:34.116,119] <dbg> modem_hl7800: on_cmd_sockok: sem_give, sock_send_sem
[00:14:40.107,208] <err> modem_hl7800: Invalid socket_id(-1) for net_ctx:0x20006534!
[00:14:40.107,269] <err> net_lwm2m_engine: Cannot connect UDP (-22)
[00:14:40.107,330] <dbg> modem_hl7800: allow_sleep: Keep awake
[00:14:40.107,391] <dbg> modem_hl7800: send_at_cmd: OUT: [AT+KUDPCLOSE=-1]
[00:14:40.108,825] <dbg> modem_hl7800: send_at_cmd: sem_take, sock_send_sem
[00:14:40.116,088] <dbg> modem_hl7800: hl7800_rx: HANDLE +CME ERROR:  (len:3)
[00:14:40.116,149] <err> modem_hl7800: Error code: 916
[00:14:46.108,947] <err> net_lwm2m_rd_client: Failed to update registration. Falling back to full registration

Environment (please complete the following information):

rerickson1 commented 11 months ago

Thanks for filing this. I have a PR that I need to get started that fixes some sleep issues that will likely fix this.

github-actions[bot] commented 9 months ago

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.