KSPP / linux

Linux kernel source tree (Kernel Self Protection Project)
https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project

Eliminate fake flexible arrays from the kernel ("variable length" one-element and zero-length arrays) #21

Open kees opened 4 years ago

kees commented 4 years ago

Dependent bugs:

There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members” for these cases. The older style of one-element or zero-length arrays should no longer be used.

In older C code, dynamically sized trailing elements were done by specifying a one-element array at the end of a structure:

struct something {
        size_t count;
        struct foo items[1];
};

This led to fragile size calculations via sizeof() (which would need to remove the size of the single trailing element to get a correct size of the “header”). A GNU C extension was introduced to allow for zero-length arrays, to avoid these kinds of size problems:

struct something {
        size_t count;
        struct foo items[0];
};

But this led to other problems, and didn’t solve some problems shared by both styles, like not being able to detect when such an array is accidentally being used not at the end of a structure (which could happen directly, or when such a struct was in unions, structs of structs, etc).
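
The "not at the end of a structure" problem can be illustrated with a small userspace sketch. The struct name `misplaced`, the `flags` member, the definition of `struct foo`, and both helper functions are hypothetical and exist only for this illustration. The declaration is valid C and compiles without complaint, even though the would-be second array element lands on the member that follows the array:

```c
#include <stddef.h>

struct foo {
        int a;
        int b;
};

/* Hypothetical struct: the "variable length" array is accidentally not last. */
struct misplaced {
        size_t count;
        struct foo items[1];
        int flags;
};

/* Offset at which a second element, items[1], would be stored. */
size_t second_item_offset(void)
{
        return offsetof(struct misplaced, items) + sizeof(struct foo);
}

/* Offset of the member that follows the array. */
size_t flags_offset(void)
{
        return offsetof(struct misplaced, flags);
}
```

With this layout the two offsets coincide, so code that treats items as a dynamically sized array and writes a second element would overwrite flags, and nothing in the declaration flags the mistake.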

C99 introduced “flexible array members”, which lack a numeric size for the array declaration entirely:

struct something {
        size_t count;
        struct foo items[];
};

This is the way the kernel expects dynamically sized trailing elements to be declared. It allows the compiler to generate errors when the flexible array does not occur last in the structure, which helps to prevent some kinds of undefined behavior bugs from being inadvertently introduced to the codebase. It also allows the compiler to correctly analyze array sizes (via sizeof(), CONFIG_FORTIFY_SOURCE, and CONFIG_UBSAN_BOUNDS). For instance, there is no mechanism that warns us that the following application of the sizeof() operator to a zero-length array always results in zero:

struct something {
        size_t count;
        struct foo items[0];
};

struct something *instance;

instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL);
instance->count = count;

size = sizeof(instance->items) * instance->count;
memcpy(instance->items, source, size);

In the last two lines of code above, size turns out to be zero, when one might have thought it represents the total size in bytes of the dynamic memory recently allocated for the trailing array items. Here are a couple of examples of this issue: link 1, link 2. Flexible array members, by contrast, have incomplete type, so the sizeof() operator cannot be applied to them, and any misuse of the operator is noticed immediately at build time.
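
The zero result can be checked in userspace with a minimal sketch. It assumes a compiler that accepts the GNU zero-length array extension (GCC and Clang do by default). The names `something_zero`, `array_based_size`, and `element_based_size`, and the definition of `struct foo`, are made up for this illustration:

```c
#include <stddef.h>

struct foo {
        int a;
        int b;
};

/* Zero-length trailing array: a GNU C extension, not standard C. */
struct something_zero {
        size_t count;
        struct foo items[0];
};

/*
 * Mirrors the buggy computation above: sizeof() of the whole
 * zero-length array is 0, so the product is 0 for any count.
 * The operand of sizeof is not evaluated, so the null pointer
 * is never dereferenced.
 */
size_t array_based_size(size_t count)
{
        return sizeof(((struct something_zero *)0)->items) * count;
}

/* Uses the size of one element instead, which gives the intended byte count. */
size_t element_based_size(size_t count)
{
        return sizeof(((struct something_zero *)0)->items[0]) * count;
}
```

If items were declared as a flexible array member, the first function would fail to compile, while the second would keep working unchanged.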

With respect to one-element arrays, one has to be acutely aware that such arrays occupy at least as much space as a single object of the type, hence they contribute to the size of the enclosing structure. This is prone to error every time people want to calculate the total size of dynamic memory to allocate for a structure containing an array of this kind as a member:

struct something {
        size_t count;
        struct foo items[1];
};

struct something *instance;

instance = kmalloc(struct_size(instance, items, count - 1), GFP_KERNEL);
instance->count = count;

size = sizeof(instance->items) * instance->count;
memcpy(instance->items, source, size);

In the example above, we had to remember to calculate count - 1 when using the struct_size() helper; otherwise we would have unintentionally allocated memory for one too many items objects. The cleanest and least error-prone way to implement this is to use a flexible array member instead:

struct something {
        size_t count;
        struct foo items[];
};

struct something *instance;

instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL);
instance->count = count;

size = sizeof(instance->items[0]) * instance->count;
memcpy(instance->items, source, size);
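
For experimenting outside the kernel, the same pattern can be approximated in userspace. This is only a sketch: `something_size()` is a hypothetical stand-in for struct_size() (it is not the kernel macro), `something_alloc()` and the definition of `struct foo` are assumptions for the example, and malloc() replaces kmalloc():

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct foo {
        int a;
        int b;
};

struct something {
        size_t count;
        struct foo items[];
};

/*
 * Hypothetical userspace stand-in for struct_size(): bytes needed for
 * the header plus count trailing elements, or SIZE_MAX on overflow.
 */
size_t something_size(size_t count)
{
        size_t header = sizeof(struct something);

        if (count > (SIZE_MAX - header) / sizeof(struct foo))
                return SIZE_MAX;
        return header + count * sizeof(struct foo);
}

/* Allocates an instance and copies count elements from source. */
struct something *something_alloc(const struct foo *source, size_t count)
{
        size_t bytes = something_size(count);
        struct something *instance;

        if (bytes == SIZE_MAX)
                return NULL;

        instance = malloc(bytes);
        if (!instance)
                return NULL;

        instance->count = count;
        /* Element size times count; no "- 1" adjustment is needed. */
        memcpy(instance->items, source, sizeof(instance->items[0]) * count);
        return instance;
}
```

The caller owns the returned pointer and releases it with free(). Because the array contributes nothing to sizeof(struct something), the allocation size and the copy size both use the same count.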

kees commented 4 years ago

treewide patch from Gustavo via Coccinelle: https://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux.git/commit/?h=for-next/fam

kees commented 4 years ago

Additionally, Documentation/process/deprecated.rst should be updated and a test added to scripts/checkpatch.pl.

kees commented 4 years ago

/cc @GustavoARSilva

kees commented 4 years ago

It would be nice if the compiler had a mode to warn about [0] and [1]-sized arrays.

GustavoARSilva commented 4 years ago

testing comments

kees commented 4 years ago

Once all the one-element arrays are removed from the kernel, we can change UBSan's -fsanitize=bounds to -fsanitize=bounds-strict for GCC (but not Clang). See https://github.com/KSPP/linux/issues/25

kees commented 4 years ago

> It would be nice if the compiler had a mode to warn about [0] and [1]-sized arrays.

Clang supports -Wzero-length-array. I've asked for the same warning in gcc.

kees commented 4 years ago

> It would be nice if the compiler had a mode to warn about [0] and [1]-sized arrays.

Clang already supports -Wzero-length-array and GCC is open to adding it: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94428

No real traction on a one-element array warning, but adding bounds checking should catch those?

GustavoARSilva commented 4 years ago

Added the treewide patch to testing/fam1 for further 0-day CI testing:

https://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux.git/commit/?h=testing/fam1

Notice that include/uapi/ is excluded for now[1][2].

[1] https://lore.kernel.org/lkml/20200424121553.GE26002@ziepe.ca/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1e6e9d0f4859ec698d55381ea26f4136eff3afe1

kees commented 4 years ago

Won't this totally break stuff that is using sizeof() and "- 1" in the wrong places? Or is this just to collect build failure logs?

GustavoARSilva commented 4 years ago

> Won't this totally break stuff that is using sizeof() and "- 1" in the wrong places? Or is this just to collect build failure logs?

Yep. The purpose is to collect non-obvious build failures on multiple architectures. :) I've been documenting the ones I find during my builds (x86_64, allyesconfig). I'm also auditing every instance of this in order to fix the code that uses sizeof() and "- 1" (and other variants) to calculate the total size of the structure.

GustavoARSilva commented 4 years ago

> Additionally, Documentation/process/deprecated.rst should be updated and a test added to scripts/checkpatch.pl.

The documentation patch is ready and waiting to be applied: https://lore.kernel.org/lkml/20200608213711.GA22271@embeddedor/