Open joerchan opened 3 years ago
Potential solution:
CONFIG_BT_CTLR_RX_BUFFERS
Alternative:
I can confirm that this issue is present for hci_uart. The problem is that this sample is based on the 'interrupt driven' UART API, which does not allow partial readout of the UART.
One solution for UART is to switch over to the 'asynchronous' UART API. This will be done in a new sample: https://github.com/zephyrproject-rtos/zephyr/pull/59880
IMO we should just bite the bullet and make a dedicates Host Number of Completed Packets command pool. It should be a good-enough stopgap measure until the general flow control issue is properly addressed.
This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.
This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.
This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.
Describe the bug A few controller-side HCI drivers (see below) are allocating command buffers using K_NO_WAIT from interrupt context. This was done assuming that the Host to Controller flow control of commands (num complete in command complete / status) would limit the number of commands that the host can have queued up in the controller. However the Host Number of Completed packets (Controller to Host flow control) does not follow the regular flow control and can be sent from the host at any time. There is no command-complete / status event for this command.
In cases where the controller does not process the incoming commands in time before the command buffer pool is drained the command will be dropped by the controller side HCI driver.
Affected drivers are: HCI UART HCI RPMsg HCI SPI USB Bluetooth class
To Reproduce Reproduced by robustness test. Test was doing SMP pairing procedure immediately upon connection, causing multiple packets to be sent at the same time as control procedures are being run. Controller was delayed in processing of commands by ECDH HCI commands.
Expected behavior Controller HCI drivers should handle Host Number of Completed Packets outside of the regular flow control mechanism.
Impact Controller HCI driver drops HCI commands. If this is a Host Number of Completed Packets and will "leak" packet buffer count that it can send to the host. If this is any other HCI command that has flow control the Host side will not received HCI command complete / status and the Host will timeout waiting for the reply.
Logs and console output Host side:
Controller side:
Environment (please complete the following information):
Additional context Workaround: Increase HCI command buffer count
CONFIG_BT_BUF_CMD_TX_COUNT
from its default of 2 to 10. Workaround added here: https://github.com/zephyrproject-rtos/zephyr/pull/37714 and here: https://github.com/zephyrproject-rtos/zephyr/pull/37759 This would mask the error but is not a proper solution.