amazon-ion / ion-c

A C implementation of Amazon Ion.
https://amazon-ion.github.io/ion-docs/
Apache License 2.0
166 stars 43 forks source link

Fix undefined behavior from calling memcmp with NULL arguments #306

Closed nirosys closed 2 years ago

nirosys commented 2 years ago

Issue #, if available: None

Description of changes: _ion_symbol_table_compare_fn, which acts as a comparison function for the ION_INDEX within a symbol table, seems to have been written with the assumption that the provided arguments key1 & key2, which ultimately are ION_SYMBOLS, are not NULL string values.

At the beginning of the function, there are two ASSERTs wrapped in #ifdef DEBUG that tests to ensure the symbols' strings are not null. However, DEBUG is never defined, and ultimately this code sees symbols with null string values.

ub-san identifies the issue with these two errors:

/home/ubuntu/ion-c/ionc/ion_symbol_table.c:2154:22: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/include/string.h:65:33: note: nonnull attribute specified here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/ubuntu/ion-c/ionc/ion_symbol_table.c:2154:22 in 
/home/ubuntu/ion-c/ionc/ion_symbol_table.c:2154:41: runtime error: null pointer passed as argument 2, which is declared to never be null
/usr/include/string.h:65:33: note: nonnull attribute specified here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/ubuntu/ion-c/ionc/ion_symbol_table.c:2154:41 in 

The problem is that memcmp explicitly defines its arguments as being non-NULL. The C99 standard describes a NULL argument to any std library function as being invalid, unless otherwise stated in the documentation for the function. memcmp does not declare any exceptions to that rule.

This situation only arrises when two null symbols are being compared. Since if only one of the symbols was null, the second clause in the if-chain would catch and return the difference of lengths (since a null symbol has length 0). In the event that both symbols are null however, the logic falls through (length1 - length2 == 0) and the function leans on memcmp.

This PR adds another clause after we know the lengths are equal to establish equality when the lengths are zero.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.