ClangBuiltLinux / linux

Linux kernel source tree
Other
240 stars 14 forks source link

-Wattribute-warning in drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_bloom_filter.c #1781

Open nathanchance opened 1 year ago

nathanchance commented 1 year ago

After commit f7cd05c76c70 ("fortify: Use __builtin_dynamic_object_size() when available") in -next, I see the following warning in ARCH=s390 allmodconfig:

$ make -skj"$(nproc)" ARCH=s390 CC=clang CROSS_COMPILE=s390x-linux-gnu- O=build mrproper allmodconfig drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_bloom_filter.o
In file included from ../drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_bloom_filter.c:5:
In file included from ../include/linux/gfp.h:7:
In file included from ../include/linux/mmzone.h:8:
In file included from ../include/linux/spinlock.h:63:
In file included from ../include/linux/lockdep.h:14:
In file included from ../include/linux/smp.h:13:
In file included from ../include/linux/cpumask.h:12:
In file included from ../include/linux/bitmap.h:11:
In file included from ../include/linux/string.h:254:
../include/linux/fortify-string.h:430:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning]
                        __write_overflow_field(p_size_field, size);
                        ^
../include/linux/fortify-string.h:520:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning]
                        __write_overflow_field(p_size_field, size);
                        ^
2 errors generated.
...

This appears to be the memset() and memcpy() calls in __mlxsw_sp_acl_bf_key_encode(), as commenting them out makes the warning disappear.

cc @kees

kees commented 1 year ago

"output" is always sized to one of these, depending on caller:

MLXSW_SP2_BLOOM_KEY_LEN MLXSW_SP4_BLOOM_KEY_LEN

Both callers appears to bounds-check all of these using their respective chunk defines. I don't see anything that would be arch-specific, though.

nathanchance commented 1 year ago

This is now visible in mainline: https://github.com/ClangBuiltLinux/continuous-integration2/actions/runs/4242664339/jobs/7376999640

I can reproduce this with just CONFIG_FORTIFY_SOURCE=y and CONFIG_MLXSW_SPECTRUM=m on s390 but I agree with Kees that I do not immediately see why this would be architecture specific, although it does not show up on any other configurations I build.

I am not sure this is related to #1687, as Nick's changes to LLVM do not resolve this for me (which makes sense, since KASAN is not enable for these configurations).

kees commented 1 year ago

The warning has slightly more detail with the inline tracking patch:

../include/linux/fortify-string.h:520:4: warning: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
                        __write_overflow_field(p_size_field, size);
                        ^
note: In function 'mlxsw_sp4_acl_bf_index_get'
note:   which inlined function 'mlxsw_sp4_acl_bf_key_encode'

But this doesn't tell us anything new, as mlxsw_sp4_acl_bf_key_encode was already known.

kees commented 1 year ago
    112e: a7 29 00 00   lghi    %r2, 0
    1132: a7 39 00 02   lghi    %r3, 2
    1136: c0 e5 00 00 00 00     brasl   %r14, 0x1136
                0000000000001138:  R_390_PLT32DBL       __write_overflow_field+0x2

This shows fortify_memcpy_chk with p_size_field = 0x0 (?!) and size = 0x2.

kees commented 1 year ago

If I grow the sp4 bf_key by 1, p_size_field becomes 1. If I grow it by size, it cycles between needing 2 (sizeof(erp_region_id)) more or 18 (MLXSW_SP4_BLOOM_CHUNK_KEY_BYTES) more. i.e. the 2 memcpys.

Is a loop unroller going too far?

kees commented 1 year ago

This makes the warning go away (?!)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_bloom_filter.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_bloom_filter.c
index e2aced7ab454..0f4419343a17 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_bloom_filter.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_bloom_filter.c
@@ -235,7 +235,7 @@ __mlxsw_sp_acl_bf_key_encode(struct mlxsw_sp_acl_atcam_region *aregion,
                             u8 key_offset, u8 chunk_key_len, u8 chunk_len)
 {
        struct mlxsw_afk_key_info *key_info = aregion->region->key_info;
-       u8 chunk_index, chunk_count, block_count;
+       u8 chunk_index, chunk_count, block_count, chunk_start;
        char *chunk = output;
        __be16 erp_region_id;

@@ -243,7 +243,12 @@ __mlxsw_sp_acl_bf_key_encode(struct mlxsw_sp_acl_atcam_region *aregion,
        chunk_count = 1 + ((block_count - 1) >> 2);
        erp_region_id = cpu_to_be16(aentry->ht_key.erp_id |
                                   (aregion->region->id << 4));
-       for (chunk_index = max_chunks - chunk_count; chunk_index < max_chunks;
+
+       chunk_start = max_chunks - chunk_count;
+       if (WARN_ON_ONCE(chunk_start >= max_chunks))
+               chunk_start = 0;
+
+       for (chunk_index = chunk_start; chunk_index < max_chunks;
             chunk_index++) {
                memset(chunk, 0, pad_bytes);
                memcpy(chunk + pad_bytes, &erp_region_id,

If I make the WARN do chunk_start = max_chunks, it breaks again. It seems the for loop is not getting processed correctly? I'm really baffled.

bwendling commented 1 year ago

If you want to test if it's the loop unroller, you might be able to add something like -mllvm unroll-threshold=0 to the compile flags to see if it still gives the warning...

kees commented 1 year ago

Ta-da! No warning.

make CC=clang KCFLAGS="-mllvm -unroll-threshold=0" LLVM_IAS=1 LD=s390x-linux-gnu-ld O=clang-s390 ARCH=s390 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_bloom_filter.o
make[1]: Entering directory '/srv/code/clang-s390'
  GEN     Makefile
  CALL    ../scripts/checksyscalls.sh
  CC [M]  drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_bloom_filter.o
make[1]: Leaving directory '/srv/code/clang-s390'
bwendling commented 1 year ago

BOO YEAH! :-)

Okay, you could start fiddling with the threshold to see at what point it breaks. I believe the default for -O2 is 150.

nathanchance commented 1 year ago

#pragma clang loop unroll_count(3) shows the warning for me, whereas #pragma clang loop unroll_count(2) does not.

kees commented 1 year ago

max_chunks is always 3, so the loop should never reach 3 loops. Is there some kind of off-by-one in the s390 loop unroller?

bwendling commented 1 year ago

It would be from s390's TargetTransformInfo. I know nothing about that back-end, so can't really say. But it should be fairly straight-forward to debug.

To get an IR file without optimizations applied, add this: -emit-llvm -mllvm -disable-llvm-optzns From there you should be able to use opt to grab the function right before loop unrolling.

kees commented 1 year ago

I have no idea what I'm looking at now. :) Hopefully someone with more familiarity with this can debug it.

bwendling commented 1 year ago

@nathanchance what's the compilation command for that file?

nathanchance commented 1 year ago

@bwendling This reproduces for me with this .i file and the following command:

$ clang --target=s390x-linux-gnu -march=z196 -O2 -c -o /dev/null spectrum_acl_bloom_filter.i
...
In file included from ../drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_bloom_filter.c:5:
In file included from ../include/linux/gfp.h:7:
In file included from ../include/linux/mmzone.h:8:
In file included from ../include/linux/spinlock.h:63:
In file included from ../include/linux/lockdep.h:14:
In file included from ../include/linux/smp.h:13:
In file included from ../include/linux/cpumask.h:12:
In file included from ../include/linux/bitmap.h:11:
In file included from ../include/linux/string.h:254:
../include/linux/fortify-string.h:430:4: warning: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
   __write_overflow_field(p_size_field, size);
   ^
../include/linux/fortify-string.h:520:4: warning: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
   __write_overflow_field(p_size_field, size);
   ^
...
bwendling commented 1 year ago

Reduced a bit:

$ opt --passes='loop(loop-unroll-full)' mlxsw_sp2_acl_bf_index_get.ll -o - | llc -o /dev/null
warning: call to __write_overflow_field marked "dontcall-warn": detected write beyond size of field (1st parameter); maybe use struct_group()?

mlxsw_sp2_acl_bf_index_get.ll.txt

bwendling commented 1 year ago

Quick note: From what I can see, the loop unroller determines that it loops by four, just like @kees thought would happen....I don't know why it's doing that. It doesn't appear to be architecture-specific, since it involves SCEV...

bwendling commented 1 year ago

I looked at the x86 code, and it seems like inlining isn't taking place at some level, so that might be why it's not generating this warning.

bwendling commented 1 year ago

Verified that even on x86 if all of the inlining is done then it will emit the same warning/error:

./include/linux/fortify-string.h:430:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning]
                        __write_overflow_field(p_size_field, size);
                        ^
./include/linux/fortify-string.h:520:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning]
                        __write_overflow_field(p_size_field, size);
                        ^

This makes me question whether the fortify_mem*_chk functions have a bug in them...

bwendling commented 1 year ago

GCC complains about having to inline that function, but apparently does it. However, it (apparently) doesn't fully unroll the loop, so it doesn't leave the offending functions.

bwendling commented 1 year ago
$ clang -std=gnu11 -O2 -S -o /dev/null red.c
red.c:98:4: warning: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
                        __write_overflow_field(p_size_field, size);
                        ^
2 warnings generated.

Removing one of the memcpy's and it compiles without a warning. It requires both memcpy's. (The memset wasn't needed.)

red.c.txt

bwendling commented 1 year ago

@kees This doesn't solve the issue, but is there a reason why the __write_overflow* and __read_overflow* functions aren't marked as noreturn?

bwendling commented 1 year ago

Ooookaaay...So, I've been staring at this for a while now and my eyes have stopped bleeding enough for me to realize that this might not be a compiler issue but maybe the fortify check itself. The issue seems to be that the if-then statement with __write_overflow_field is causing the loop trip count to artificially increase (in this case to 4). This causes the loop unroller to really mess up the CFG and there's an extra "dead" unroll just kind of hanging about.

ANYWHO, if we could massage the code to omit that extra trip count the code compiles just fine. There's this comment below the offending code:

        /*
         * Warn when writing beyond destination field size.
         *
         * We must ignore p_size_field == 0 for existing 0-element
         * fake flexible arrays, until they are all converted to
         * proper flexible arrays.
         *
         * The implementation of __builtin_*object_size() behaves
         * like sizeof() when not directly referencing a flexible
         * array member, which means there will be many bounds checks
         * that will appear at run-time, without a way for them to be
         * detected at compile-time (as can be done when the destination
         * is specifically the flexible array member).
         * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832
         */
        if (p_size_field != 0 && p_size_field != SIZE_MAX &&
            p_size != p_size_field && p_size_field < size)
                return true;

However, this same check for p_size_field != 0 isn't reflected in the checks above. I modified the code as below to have the check for a non-zero p_size_field and it appears to work.

Thoughts?

diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
index c9de1f59ee80..199a96823a3a 100644
--- a/include/linux/fortify-string.h
+++ b/include/linux/fortify-string.h
@@ -102,10 +102,18 @@ extern char *__underlying_strncpy(char *p, const char *q, __kernel_size_t size)
 #define __member_size(p)       __builtin_object_size(p, 1)
 #endif

-#define __compiletime_lessthan(bounds, length) (       \
-       __builtin_constant_p((bounds) < (length)) &&    \
-       (bounds) < (length)                             \
-)
+#define __compiletime_nonzero(p)                                       \
+({                                                                     \
+       typeof(p) __p = (p);                                            \
+       __builtin_constant_p(__p) && __p != 0;                          \
+})
+
+#define __compiletime_lessthan(bounds, length)                         \
+({                                                                     \
+       typeof(bounds) __b = (bounds);                                  \
+       typeof(length) __l = (length);                                  \
+       __builtin_constant_p(__b < __l) && __b < __l;                   \
+})

 /**
  * strncpy - Copy a string to memory with non-guaranteed NUL padding
@@ -413,7 +421,7 @@ __FORTIFY_INLINE void fortify_memset_chk(__kernel_size_t size,
                                         const size_t p_size,
                                         const size_t p_size_field)
 {
-       if (__builtin_constant_p(size)) {
+       if (__builtin_constant_p(size) && __compiletime_nonzero(size)) {
                /*
                 * Length argument is a constant expression, so we
                 * can perform compile-time bounds checking where
@@ -426,7 +434,8 @@ __FORTIFY_INLINE void fortify_memset_chk(__kernel_size_t size,
                        __write_overflow();

                /* Warn when write size is larger than dest field. */
-               if (__compiletime_lessthan(p_size_field, size))
+               if (__compiletime_nonzero(p_size_field) &&
+                   __compiletime_lessthan(p_size_field, size))
                        __write_overflow_field(p_size_field, size);
        }
        /*
@@ -500,7 +509,7 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
                                         const size_t q_size_field,
                                         const char *func)
 {
-       if (__builtin_constant_p(size)) {
+       if (__builtin_constant_p(size) && __compiletime_nonzero(size)) {
                /*
                 * Length argument is a constant expression, so we
                 * can perform compile-time bounds checking where
@@ -516,7 +525,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
                        __read_overflow2();

                /* Warn when write size argument larger than dest field. */
-               if (__compiletime_lessthan(p_size_field, size))
+               if (__compiletime_nonzero(p_size_field) &&
+                   __compiletime_lessthan(p_size_field, size))
                        __write_overflow_field(p_size_field, size);
                /*
                 * Warn for source field over-read when building with W=1
bwendling commented 1 year ago

And, no, I don't know what GCC doesn't have this issue. My guess is that their loop unrolling algorithm is different enough that it omits the extra loop unroll...

kees commented 1 year ago

Hmm, so I spent a little time looking at this. While the changes to the macro do make the warning go away, this isn't a workable solution because we need to be able to reject zero-sized destination buffers. Do you have a minimized test case for this, by any chance?

kees commented 1 year ago

@kees This doesn't solve the issue, but is there a reason why the __write_overflow* and __read_overflow* functions aren't marked as noreturn?

This is because they don't (yet) call fortify_panic() since we have to be in WARN-only mode until Linus is happy with the amount of bake time (years) for restricting the memcpy()/memset() API like this.