Open alwa-nordic opened 3 weeks ago
The most common pattern is to register the callback before starting Bluetooth, right? Are there use cases where this is not the case? If not, I suggest documenting it.
Is this an issue specifically for the conn callbacks, or for all callbacks?
I'm not opposed to removing the unregister()
API, however at least the concern where linked list nodes may get removed in the middle of iterating the list should be covered by the existence of SYS_SLIST_FOR_EACH_NODE_SAFE()
(which is specifically designed for such a use case).
I'm not opposed to removing the
unregister()
API, however at least the concern where linked list nodes may get removed in the middle of iterating the list should be covered by the existence ofSYS_SLIST_FOR_EACH_NODE_SAFE()
(which is specifically designed for such a use case).
Unfortunately SYS_SLIST_FOR_EACH_NODE_SAFE
only makes it safe to delete the one item that the iteration is currently at. It's not safe to perform an arbitrary amount of deletions.
It is possible to make a custom iterator-aware list implementation that is safe.
Is this an issue specifically for the conn callbacks, or for all callbacks?
This is probably a general issue. But I think it's best to consider cases one by one, as the situation may be slightly different each time, and the problem at hand is more tangible.
I think it's best to limit the scope of a single GitHub ticket to focus and make our product better a little at a time.
The most common pattern is to register the callback before starting Bluetooth, right? Are there use cases where this is not the case? If not, I suggest documenting it.
Yes, can't think of a good case. This amount of flexibility is only truly needed for exotic stuff like for dynamically loading an arbitrary number of libraries runtime that need Bluetooth callbacks.
How should we phrase this in the documentation? As a recommendation? Or as a requirement?
bt_conn_cb_register
: It's recommended to do all callback registration before the first call to bt_enable
.bt_conn_cb_unregister
: Using this is recommended against.Is this an issue specifically for the conn callbacks, or for all callbacks?
This is probably a general issue. But I think it's best to consider cases one by one, as the situation may be slightly different each time, and the problem at hand is more tangible.
I think it's best to limit the scope of a single GitHub ticket to focus and make our product better a little at a time.
The reason why I asked is that we have multiple cases where we register/unregister callbacks in the LE audio stack and samples.
We may want to scan for something specific and do
Step 3 can of course be replaced by a local boolean where we just return early in the scan receive callback.
Another use case we have for unregister (and the reason why we recently added some unregister in the LE audio API), is that when running unittests, we want to register callbacks at the beginning of the test, and remove it again at the end. The reason for this is that memory isn't cleared between each run, so we need to clear it manually. There may be some workarounds that can make this work, but there certainly are some use cases for unregister.
@alwa-nordic Even though this could be classified as a bug, it seems to be more of a general "we need to improve this area" kind of thing. So I will retag it as enhancement. So it follows a different triaging path. If you guys disagree, you are welcome to tag it back to bug and set a priority.
Ok. At least enhancement seems consistent with other issues related to thread-safety: https://github.com/zephyrproject-rtos/zephyr/issues/52364
We should evaluate whether these issues should be counted as bugs or not in the Bluetooth WG meeting. I'll add the tag.
Another thing to keep in mind is that an application can also at any point set/unset the function pointers in struct bt_conn_cb
. This is another way for an application to avoid getting callbacks when it's not interested (similar to the bool
approach mentioned above). In this case the list itself is not modified, but the pointers inside each list node may be
bt_conn_cb_register
andbt_conn_cb_unregister
are not thread safe.Fixing this is non-trivial because of
bt_conn_cb_unregister
. The following must be taken into consideration:bt_conn_cb_unregister
could happen.bt_conn_cb_unregister
returns, the callback is not running or going to be run. It is always a plus to adhere to the principle of least astonishment.It may be necessary to alter the API. Either by limiting in documentation when
bt_conn_cb_unregister
may be called, or better by changing the API to naturally synchronize. Alternatively, consider removingbt_conn_cb_unregister
, as the use-case it was added for is poorly motivated[^1].[^1]: The simpler fix would have been to call
bt_conn_cb_register
once on boot from SYS_INIT.