KSPP / linux

Linux kernel source tree (Kernel Self Protection Project)
https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project
Other
80 stars 5 forks source link

[Feature] Array Access Report #249

Open bwendling opened 1 year ago

bwendling commented 1 year ago

Make a diagnostic of some kind that reports each array access in a build as:

bwendling commented 1 year ago

@kees Could you give some examples of the array accesses? (Specifically the last two.)

kees commented 1 year ago

Sure! Here's a good set to demonstrate it: https://godbolt.org/z/Ga1jd9fvW

/* Build with -Wall -O2 -fstrict-flex-arrays=3 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <malloc.h>

#define report_size(p)      do {    \
    const size_t bdos = __builtin_dynamic_object_size(p, 1); \
    \
    if (__builtin_constant_p(bdos)) { \
        if (bdos == SIZE_MAX) { \
            printf(#p " has unknowable size\n"); \
        } else { \
            printf(#p " has a fixed size: %zu\n", bdos); \
        } \
    } else { \
        printf(#p " has a dynamic size: %zu\n", bdos); \
    } \
} while (0)

struct fixed {
    unsigned long flags;
    size_t foo;
    int array[16];
};

/* should emit "fixed" */
void do_fixed(struct fixed *p)
{
    report_size(p->array);
}

struct flex {
    unsigned long flags;
    size_t foo;
    int array[];
};

/* should emit "dynamic" */
void do_dynamic(unsigned char count)
{
    /* malloc() is marked with __attribute__((alloc_size(1))) */
    struct flex *p = malloc(sizeof(*p) + count * sizeof(*p->array));
    report_size(p->array);
    free(p);
}

/* should emit "unknowable" */
void do_unknown(struct flex *p)
{
    report_size(p->array);
}

Note that this must be built with -fstrict-flex-arrays=3 or do_fixed() will fail, and with -O2 or none of the __builtin_dynamic_object_size() will compile.

kees commented 1 year ago

Note that in each case above, accessing p->array[i] should produce the same result as the __builtin_dynamic_object_size() report from report_size().

bwendling commented 1 year ago

Slightly larger example (see https://reviews.llvm.org/D144136):

/* Build with -Wall -O2 -fstrict-flex-arrays=3 -fsanitize=bounds -Rarray-bounds */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <malloc.h>

#define report_size(p, index)      do {    \
    const size_t bdos = __builtin_dynamic_object_size(p, 1); \
    \
    if (__builtin_constant_p(bdos)) { \
        if (bdos == SIZE_MAX) { \
            printf(#p " has unknowable size\n"); \
        } else { \
            printf(#p " has a fixed size: %zu\n", bdos); \
        } \
    } else { \
        printf(#p " has a dynamic size: %zu\n", bdos); \
    } \
    printf(#p "[" #index "] assignment: %d\n", (p)[index] = 15); \
} while (0)

struct fixed {
    unsigned long flags;
    size_t foo;
    int array[16];
};

/* should emit "fixed" */
void do_fixed(struct fixed *p, int index)
{
    report_size(p->array, 0);
    report_size(p->array, index);
}

struct flex {
    unsigned long flags;
    size_t foo;
    int array[];
};

/* should emit "dynamic" */
void do_dynamic(unsigned char count, int index)
{
    /* malloc() is marked with __attribute__((alloc_size(1))) */
    struct flex *p = malloc(sizeof(*p) + count * sizeof(*p->array));
    report_size(p->array, 0);
    report_size(p->array, index);
    free(p);
}

/* should emit "unknowable" */
void do_unknown(struct flex *p, int index)
{
    report_size(p->array, 0);
    report_size(p->array, index);
}
bwendling commented 1 year ago

One bit of warning about the above program, this bit of code:

  report_size(p->array, index);

won't report a constant value for index even if it's eventually inlined. All inlining decisions are left to Clang's middle end, way after the Sema reporting stuff is finished. The best we'll be able to do is something like:

/home/morbo/llvm/array_access_report.c:32:17: remark: accessing fixed sized array 'int[16]' by index [-Rarray-bounds]
nickdesaulniers commented 1 year ago

Perhaps clang-tidy is the appropriate place to put AST based matchers?

bwendling commented 1 year ago

The changes needed for the report aren't that great. It's basically finding the place in Sema where it's checking these types of pointers and putting a diagnostic there. The hard work of identifying what we want to report on has already been done. :-)