zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

nRF5340 network core booted twice for bluetooth applications #72725

Open andvib opened 2 months ago

andvib commented 2 months ago

Describe the bug

For nRF5340, the network core is booted twice for Bluetooth applications. Verified on nrf5340dk_nrf5340 and nrf5340_audio_dk_nrf5340. The double boot happens because nrf5340_cpunet_reset.c in the board folder enables the network core directly, and drivers/bluetooth/hci/nrf53_support.c then enables it again using nrf53_cpunet_mgmt.

To Reproduce

  1. Build and flash the hci_ipc sample for nrf5340dk_nrf5340_cpunet.
  2. In boards/nordic/nrf5340dk/nrf5340_cpunet_reset.c introduce a small sleep after the network core has been enabled (end of remoteproc_mgr_boot()). This is to simulate an application where there are more systems initializing during boot time, as the bug is not seen with the peripheral_hr sample otherwise.
  3. Build and flash the peripheral_hr sample for nrf5340dk_nrf5340_cpuapp.
  4. The boot log for the network core will be printed twice as the network core is booted twice.

Expected behavior

The network core is booted only once for Bluetooth applications.

Impact

This causes an issue in the audio application in NCSDK, as the RTC references will differ between the application core and the network core when the network core is booted twice.

Logs and console output

*** Booting Zephyr OS build v3.6.0-3857-g84b8e92445f1 ***
*** Booting Zephyr OS build v3.6.0-3857-g84b8e92445f1 ***


github-actions[bot] commented 2 months ago

Hi @andvib! We appreciate you submitting your first issue for our open-source project. 🌟

Even though I'm a bot, I can assure you that the whole community is genuinely grateful for your time and effort. 🤖💙

jciupis commented 2 months ago

The assumption behind the combined changes introduced in https://github.com/zephyrproject-rtos/zephyr/pull/71337 and https://github.com/zephyrproject-rtos/zephyr/pull/72412 was that it is fine to let calls to nrfx that drive the Force-Off signal:

nrf_reset_network_force_off(NRF_RESET, false);

be executed multiple times in a row. The assumption was that only the first call changes the network CPU's state and that all subsequent calls have no effect. This seemed to be confirmed by various Bluetooth samples, and the issue description is consistent with it: the double boot is not seen with the unmodified peripheral_hr sample.

However, what this bug report shows is that this assumption is only correct if the time between subsequent calls is short enough. Otherwise, the second call causes the network CPU to reboot.
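To make the difference between the assumed and the observed behaviour concrete, here is a toy host-side model. It is only a sketch of what is described above, not hardware behaviour taken from a datasheet: `struct netcpu_model`, `model_release_force_off()` and the `settle_ms` window are all hypothetical names, and the real length of that window is not known from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the network core as seen from the application core.
 * Everything here is hypothetical and exists only for illustration. */
struct netcpu_model {
    bool released;            /* Force-Off has been released at least once */
    uint32_t last_release_ms; /* time of the most recent release call */
    uint32_t settle_ms;       /* assumed window in which a repeated call is harmless */
    unsigned int boots;       /* how many times the modelled core has booted */
};

/* Models one call that releases Force-Off at time now_ms. */
void model_release_force_off(struct netcpu_model *m, uint32_t now_ms)
{
    if (!m->released) {
        /* First release: the core boots. */
        m->released = true;
        m->boots++;
    } else if (now_ms - m->last_release_ms > m->settle_ms) {
        /* Repeated release after a long gap: the behaviour reported in
         * this issue, where the core boots a second time. */
        m->boots++;
    }
    /* A repeated release within the window has no effect on the boot count. */
    m->last_release_ms = now_ms;
}
```

With two calls close together the model counts one boot, which matches what the Bluetooth samples showed. With a longer gap between the calls it counts two, which matches the log in the issue description.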

In my opinion, the best way to fix this bug, given all the requirements for network CPU management that made it necessary in the first place, is to no longer boot the network core in board initialization files. Instead, network CPU users such as the HCI driver or the 802.15.4 driver should request it using the network CPU management API. That's a clean solution that I would have already incorporated into https://github.com/zephyrproject-rtos/zephyr/pull/72412 if not for the risk that some components implicitly depend on the board initialization code to boot the network core.

However, since this is no longer a matter of clean division of responsibilities between modules but a functional problem that needs to be solved, I think we should remove network CPU control from the board initialization files. Instead, we should require network CPU users to request and release it explicitly, similarly to how the HCI driver does it now.
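As a rough illustration of the request/release idea, the sketch below counts users and touches the Force-Off signal only on the first request and the last release. It is a host-side sketch, not the existing network CPU management code: `netcpu_request()`, `netcpu_release()`, `hw_force_off()` and the counters are hypothetical names, and `hw_force_off()` merely stands in for the `nrf_reset_network_force_off(NRF_RESET, ...)` call quoted above.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping, for illustration only. */
unsigned int netcpu_users;     /* current number of network CPU users */
unsigned int netcpu_boots;     /* times Force-Off was released */
unsigned int netcpu_hw_writes; /* total writes to the Force-Off signal */

/* Stand-in for nrf_reset_network_force_off(NRF_RESET, hold). */
static void hw_force_off(bool hold)
{
    netcpu_hw_writes++;
    if (!hold) {
        netcpu_boots++;
    }
}

/* Called by each user (e.g. an HCI or 802.15.4 driver) before it
 * needs the network core. Only the first user releases Force-Off. */
void netcpu_request(void)
{
    if (netcpu_users++ == 0) {
        hw_force_off(false);
    }
}

/* Called when a user no longer needs the network core. Only the last
 * user holds the core in Force-Off again. Returns -1 on an unbalanced
 * release, 0 otherwise. */
int netcpu_release(void)
{
    if (netcpu_users == 0) {
        return -1;
    }
    if (--netcpu_users == 0) {
        hw_force_off(true);
    }
    return 0;
}
```

With this shape, a second user requesting the core never drives the signal again, so the time between requests no longer matters. The sketch is not thread-safe; real code would need locking or an equivalent mechanism around the counter.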

andvib commented 2 months ago

@koffes

github-actions[bot] commented 22 hours ago

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.