Weak symbols have two functions: to permit overriding built-in definitions with custom definitions, and to permit calling a function
even if it is not defined.
The former usage is useful for __assert_fail; it allows users to safely override the definition. Unfortunately, the latter usage is actively dangerous: when calling a function, a weak reference is emitted, and if the particular linking order comes out differently than expected, the weakly-referenced symbol may be omitted from the link entirely. In this case, a NOP is inserted instead of a function call.
The dangerous impact of inserting a NOP in the place of __assert_fail is that it is also marked as noreturn. This means that the compiler will have generated code assuming that it would never return, and placed arbitrary other code after this point in the binary output. This silently prevents assertions from triggering (which may not be immediately noticeable), and (even worse) causes the code that fails those asserts to jump into random (and possibly unrelated) other pieces of code.
In order to solve this, we simply need to remove the weak attribute from the declaration of __assert_fail in the header file. The source file can safely continue to carry this attribute, because that will only affect the definition (allowing it to be overridden) without also permitting it to go entirely undefined.
Type of change
Please delete options that are not relevant.
[x] Bug fix (non-breaking change which fixes an issue)
How Has This Been Tested?
I inserted an assert(false); into a critical piece of code in an application. I executed the application, including that piece of code, and used GDB to demonstrate that a NOP was executed instead of a jump to __assert_fail, and that this caused execution to proceed into the code for an unrelated function. I inspected the linked application binary and object files to determine the cause, and noted that the NOP was present in the linked binary, but not in the original object file for the function in question.
After making this change, I re-ran the same tests, and confirmed that the assertion was triggered successfully and the expected assertion error message was reported.
Test Configuration:
Project version/commit: embeddedartistry/libc version 10242acf9e9c2c93e0fbfe7dc91314c926513cef
Hardware: QEMU '-M virt' platform
Toolchain: debian gcc-arm-none-eabi package, version 15:7-2018-q2-6 and debian binutils-arm-none-eabi package, version 2.31.1-12+11
SDK: custom independent build environment not incorporating any embeddedartistry/libc build scripts
Checklist:
[x] My code follows the style guidelines of this project
[x] I have performed a self-review of my own code
[ ] I have commented my code, particularly in hard-to-understand areas -- not relevant
[ ] I have made corresponding changes to the documentation -- not relevant
[x] My changes generate no new warnings
[ ] I have added tests that prove my fix is effective or that my feature works -- it would require some relative advanced testing infrastructure to validate this case; I would rather not implement the test for this bug if possible
[x] New and existing unit tests pass locally with my changes
[ ] Any dependent changes have been merged and published in downstream modules -- not relevant
Description
Weak symbols have two functions: to permit overriding built-in definitions with custom definitions, and to permit calling a function even if it is not defined.
The former usage is useful for __assert_fail; it allows users to safely override the definition. Unfortunately, the latter usage is actively dangerous: when calling a function, a weak reference is emitted, and if the particular linking order comes out differently than expected, the weakly-referenced symbol may be omitted from the link entirely. In this case, a NOP is inserted instead of a function call.
The dangerous impact of inserting a NOP in the place of __assert_fail is that it is also marked as noreturn. This means that the compiler will have generated code assuming that it would never return, and placed arbitrary other code after this point in the binary output. This silently prevents assertions from triggering (which may not be immediately noticeable), and (even worse) causes the code that fails those asserts to jump into random (and possibly unrelated) other pieces of code.
In order to solve this, we simply need to remove the weak attribute from the declaration of __assert_fail in the header file. The source file can safely continue to carry this attribute, because that will only affect the definition (allowing it to be overridden) without also permitting it to go entirely undefined.
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
I inserted an assert(false); into a critical piece of code in an application. I executed the application, including that piece of code, and used GDB to demonstrate that a NOP was executed instead of a jump to __assert_fail, and that this caused execution to proceed into the code for an unrelated function. I inspected the linked application binary and object files to determine the cause, and noted that the NOP was present in the linked binary, but not in the original object file for the function in question.
After making this change, I re-ran the same tests, and confirmed that the assertion was triggered successfully and the expected assertion error message was reported.
Test Configuration:
Checklist:
I have commented my code, particularly in hard-to-understand areas-- not relevantI have made corresponding changes to the documentation-- not relevantAny dependent changes have been merged and published in downstream modules-- not relevant