Open jori-nordic opened 5 months ago
Would you please add steps to reproduce it?
Would you please add steps to reproduce it?
I believe this refers to cases like bt_conn_unref(NULL)
(in application code) evaluating to undefined behavior (UB) when asserts are off. The question is whether it's harmful to allow UB as a result of API misuse when asserts are off, in public API, or even internal API.
The argument is that developer time saved by catching these errors is well worth the extra bytes and cycles many times over. I agree. My opinion is that simple asserts should be on-by-default globally in Zephyr, and we maintainers should make sure assert expressions are not too expensive. Disabling asserts should be a this-train-needs-no-brakes move by those who are forced into that particular requirement and are ready to bear the cost of harder debugging.
Just some comments to take along:
CHECKIF
that can be used as an assertion check or a runtime check (enabled by default with USERSPACE
).Just some comments to take along:
Assertions in most languages are no-ops when disabled, and crash/abort if a predicate is false, so we shouldn't depend on it in production code.
If a public API has documented assumptions, developers should adhere to them, and add checks before calling the functions, assertions in the callee code are there to help developers point out invalid calls.
There is also
CHECKIF
that can be used as an assertion check or a runtime check (enabled by default withUSERSPACE
).
I agree with you on all points. I think the issue here is: What should we recommend to our users? Asserts always on? Or asserts off in production? A default config is a pretty strong recommendation, so we should give it some consideration.
I'm of the opinion that asserts enabled in production is a good thing by default. And those who have special needs can make the choice for themselves. So I wish we could turn the tide and enable asserts by default for all of Zephyr.
Some data for @Thalley . I was expecting more asserts than that 😅
The point is, maybe we should have a IF_RETURN(condition, message, retval)
macro for the common pattern of checking parameters/state, logging an error and returning some value. Would de-clutter the host a bit I think.
null de-references: https://github.com/zephyrproject-rtos/zephyr/blob/243783243cbde53b5353d645c2099eaf48cf8083/subsys/bluetooth/host/gatt.c#L3469-L3486
ACL Control flow broken https://github.com/zephyrproject-rtos/zephyr/blob/243783243cbde53b5353d645c2099eaf48cf8083/subsys/bluetooth/host/hci_core.c#L574-L579
Bonding that falls on the floor if there's an error https://github.com/zephyrproject-rtos/zephyr/blob/243783243cbde53b5353d645c2099eaf48cf8083/subsys/bluetooth/host/smp.c#L3863-L3866
Bonding that falls on the floor if there's an error
This is an actual assertion. I assert that it's impossible for it to fail. You can read 'assert' as 'theorem' here.
Some data for @Thalley . I was expecting more asserts than that 😅
The point is, maybe we should have a
IF_RETURN(condition, message, retval)
macro for the common pattern of checking parameters/state, logging an error and returning some value. Would de-clutter the host a bit I think.
Hmm, we could possibly do that. However in https://docs.zephyrproject.org/latest/contribute/guidelines.html#coding-style we state
In general, follow the Linux kernel coding style
Which in return in https://www.kernel.org/doc/html/v4.10/process/coding-style.html#macros-enums-and-rtl states
Things to avoid when using macros:
macros that affect control flow:
define FOO(x) \
do { \ if (blah(x) < 0) \ return -EBUGGERED; \ } while (0)
And I generally agree with that. Implicit return statements (via macros or other ways) really messes with the readers flow.
So if you want to propose a macro with a return, be prepared for some pushback on that (and not just from me ;) )
DTS macrobatics are kosher though, definitely don't mess with people's heads, writing a driver is E-Z. I would also argue that an ifdef soup like we seem to love qualifies as a "macro that messes with the control flow". Sigh.. sorry for the deranged stream of thought, I'm just frustrated we're stuck with this language :|
In principle, I also agree that macros shouldn't mess with control flow.
Doing it correctly will end up with having to scroll through one page of error checking prologue for each API, all with the same if (bad) { log-something; return -EFIGURETHISOUT }
copy-pasted all over. When I'll get to the fn logic, I'm sure I will already have forgotten why I was looking at that fn in the first place..
I really don't know what to do.
Sigh.. sorry for the deranged stream of thought, I'm just frustrated we're stuck with this language :|
It was going so well until you and @alwa-nordic started coming here with all your rationale and logic. It was so much easier when no-one questioned why we do what we do ;)
To add some perspective, we tried to write an H4 HCI packet decoder in rust with @alwa-nordic and it was even more verbose than the C version :sweat_smile: . So not sure the grass is greener on that side either.
To add some perspective, we tried to write an H4 HCI packet decoder in rust with @alwa-nordic and it was even more verbose than the C version 😅 . So not sure the grass is greener on that side either.
This does not represent my opinion. /disclaimer :slightly_smiling_face:
Doing it correctly will end up with having to scroll through one page of error checking prologue for each API, all with the same
if (bad) { log-something; return -EFIGURETHISOUT }
copy-pasted all over. When I'll get to the fn logic, I'm sure I will already have forgotten why I was looking at that fn in the first place..
Maybe this could be solved be adding folding marks around the obvious stuff. :thinking:
The Bluetooth host stack doesn't implement enough error handling. A lot of parameter and state checking happens using
__ASSERT()
macros.This is a good thing, as it reduces the mental load of parsing error handling. That "error handling" usually results in a broken state somewhere else, leaks, etc..
So we need some sort of
ASSERT()
macro that is always enabled.Two solutions:
__ASSERT()
macros toBT_ASSERT()
select ASSERT
when Bluetooth is enabledNB: in the future, that macro could be redirected to allow graceful shutdown of the Bluetooth thread, if running in userspace. That way an error condition in the stack will no longer bring down the entire kernel.