Open LingaoM opened 1 week ago
if we mark it as deprecated, I would go one step further: make a bt_hci_cmd_send_async()
(with maybe a bt_hci_cmd_wait_rsp()
too) and let the application manage synchronisation itself.
if we mark it as deprecated, I would go one step further: make a
bt_hci_cmd_send_async()
(with maybe abt_hci_cmd_wait_rsp()
too) and let the application manage synchronisation itself.
What are the benefits of changing to asynchronous?
It opens new possibilities. The application can define a wrapper that restores the synchronous behavior, but it can also work in a more event-loop or cooperative manner.
I.e. you can fire off a VS command that starts a special advertiser or something and immediately start acquisition of a sensor value. You don't have to choose between more memory (for two threads) or slowness due to having to wait for the response all the time. This is especially useful for slow HCI links I suppose.
For usage in the stack, it allows making proper state machines instead of holding hostage the thread that called bt_init()
or the syswq. There are not many places where we have a bunch of synchronous calls in the host, but that's where the bugs have been in the past. Defining proper state machines instead of a nest of if () {send-sync-cmd()}
s will give us more readability and robustness.
A few doubts with asynchronous:
If change from sync to async, how to confirm whether the command is actually executed successfully, asynchronous notification by defining callback? Does such a design require the reconstruction of all designs of the current bluetooth host protocol stack also Mesh and LE Audio?
If change from sync to async, how to confirm whether the command is actually executed successfully, asynchronous notification by defining callback?
Yes, an async API has to give some kind of signal when the result is ready. This can be a callback, semaphore or other event.
Does such a design require the reconstruction of all designs of the current bluetooth host protocol stack also Mesh and LE Audio?
No, you can always transform the async version to sync by simply waiting for the result signal after starting the async operation. The async version is more flexible.
We have a sketch of a possible bt_hci_cmd_send_async
. There you can see that bt_hci_cmd_send_sync
is implemented using the async version.
if we mark it as deprecated, I would go one step further: make a
bt_hci_cmd_send_async()
(with maybe abt_hci_cmd_wait_rsp()
too) and let the application manage synchronisation itself.
Looks great, maybe we can do these two things together, which will make the code more readable.
We have a sketch of a possible
bt_hci_cmd_send_async
. There you can see thatbt_hci_cmd_send_sync
is implemented using the async version.
Converting to asynchronous may cause some behavioral inconsistencies:
If we call bt_hci_cmd_send_async
in sysworkq thread, but not call k_sem_take(&sync_sem, HCI_CMD_TIMEOUT)
, the sysworkq will be continue to running, maybe in other place, some code also call bt_hci_cmd_send_async
, then will be ASSERT there(https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/bluetooth/host/hci_core.c#L404).
In this condition, look like we must be bound to this sysworkq and wait must be called, but adding this constraint is strange for user development even for host & mesh, also reduce code readability.
So it looks like we can only remove this ASSERT, so there no constraint for upper developer.
But then, there will be a new problem. We assume that k_sem_take(&sync_sem, HCI_CMD_TIMEOUT)
is not used after the second call bt_hci_cmd_send_async
, and then the sysworkq continues to running, maybe in other place, some code also call bt_hci_cmd_send_async
again, this will enter K_FOREVER for allocate insufficient command buffer, so sysworkq will be blocked(even not blocked by call bt_hci_cmd_send_async
, other code also maybe enter block state).
In this case, the only way to avoid deadlock is not to call net_buf_unref(cmd_buf)
on the sysworkq thread, but such a constraint is very strange.
Is your enhancement proposal related to a problem? Please describe.
bt_hci_cmd_send_sync
borrow cmd buffer as response, maybe cause some ram overhead: https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/bluetooth/host/hci_core.c#L162Describe the solution you'd like Response should alloc by caller, not by API internal.
Describe alternatives you've considered See: https://github.com/zephyrproject-rtos/zephyr/pull/74600 For Zephyr API lifecycle, to avoiding cause breaking API changes, add new API:
bt_hci_cmd_send_sync_v2
, and oldbt_hci_cmd_send_sync
mark __deprecated.Additional context None