ClangBuiltLinux / linux

Linux kernel source tree
Other
241 stars 14 forks source link

Fortify source failure in drivers/infiniband/core/cma.c with s390 allmodconfig #1687

Open nathanchance opened 2 years ago

nathanchance commented 2 years ago

When building s390 allmodconfig with clang-14 on mainline, I see the following error:

In file included from drivers/infiniband/core/cma.c:9:
In file included from include/linux/completion.h:12:
In file included from include/linux/swait.h:7:
In file included from include/linux/spinlock.h:62:
In file included from include/linux/lockdep.h:14:
In file included from include/linux/smp.h:13:
In file included from include/linux/cpumask.h:12:
In file included from include/linux/bitmap.h:11:
In file included from include/linux/string.h:253:
include/linux/fortify-string.h:414:4: error: call to __read_overflow declared with 'error' attribute: detected read beyond size of object (1st parameter)
                        __read_overflow();
                        ^
1 error generated.

I do not see this with clang-15 or clang-16; git bisect landed on https://github.com/llvm/llvm-project/commit/702d5de4380b1e1554e5b90863093c3a57f76f70 as the "fix" commit but I am not really sure what to make of that.

It is not visible with 5.19. I bisected Linux to commit 925d046e7e52 ("RDMA/core: Add a netevent notifier to cma"). Specifically, it appears to be the

        ips_node = node_from_ndev_ip(&id_table, neigh->dev->ifindex,
                         (struct sockaddr *)&neigh_sock_4);

as I do not see the warning when that is commented. I don't have time to do more triage at the moment.

cc @kees

nathanchance commented 2 years ago

It looks like commit fa35198f3957 ("fortify: Explicitly check bounds are compile-time constants") causes this to become visible even with clang-15 and clang-16:

$ clang --version
clang version 15.0.0 (Fedora 15.0.0-3.fc38)
Target: x86_64-redhat-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

$ make -skj"$(nproc)" ARCH=s390 CC=clang CROSS_COMPILE=s390x-linux-gnu- O=build allmodconfig drivers/infiniband/core/cma.o
In file included from ../drivers/infiniband/core/cma.c:9:
In file included from ../include/linux/completion.h:12:
In file included from ../include/linux/swait.h:7:
In file included from ../include/linux/spinlock.h:63:
In file included from ../include/linux/lockdep.h:14:
In file included from ../include/linux/smp.h:13:
In file included from ../include/linux/cpumask.h:12:
In file included from ../include/linux/bitmap.h:11:
In file included from ../include/linux/string.h:253:
../include/linux/fortify-string.h:525:4: error: call to __read_overflow declared with 'error' attribute: detected read beyond size of object (1st parameter)
                        __read_overflow();
                        ^
1 error generated.
...

I am having a hard time following the code through the casts but based on commenting out the ipv6_addr_cmp() call in compare_netdev_and_ip() and the previously mentioned commenting out of the ipv4 call to node_from_ndev_ip() in cma_netevent_callback(), it seems like maybe if (sa->sa_family == AF_INET) is not triggering, so ipv6_addr_cmp() is getting called with neigh_sock_4?

This appears related to KASAN according to my cvise reduction, as it does not reproduce without -fsanitize=kernel-address. The reduced version is:

struct sockaddr {
  short sa_family;
} *node_from_ndev_ip_sb;
struct in_addr {};
struct sockaddr_in {
  struct in_addr sin_addr;
};
int node_from_ndev_ip_root_0;
void __read_overflow(void) __attribute__((
    __error__("detected read beyond size of object (1st parameter)")));
int memcmp(const void *p, const void *q, unsigned long size) {
  long p_size = __builtin_object_size(p, 0);
  if (p_size < size)
    __read_overflow();
  return 0;
}
void node_from_ndev_ip(struct sockaddr *sa) {
  while (node_from_ndev_ip_root_0) {
    struct sockaddr __trans_tmp_3 = *sa;
    if (__trans_tmp_3.sa_family)
      memcmp(&((struct sockaddr_in *)sa)->sin_addr,
             &((struct sockaddr_in *)node_from_ndev_ip_sb)->sin_addr,
             sizeof(__trans_tmp_3));
  }
}
void cma_netevent_callback() {
  struct sockaddr_in neigh_sock_4;
  node_from_ndev_ip((struct sockaddr *)&neigh_sock_4);
  for (;;)
    ;
}
$ s390x-linux-gnu-gcc --version | head -1
s390x-linux-gnu-gcc (GCC) 12.2.1 20220819 (Red Hat Cross 12.2.1-1)

$ clang --version | head -1
clang version 15.0.0 (Fedora 15.0.0-3.fc38)

$ s390x-linux-gnu-gcc -O2 -c -o /dev/null cma.i

$ clang -O2 --target=s390x-linux-gnu -c -o /dev/null cma.i

$ s390x-linux-gnu-gcc -fsanitize=kernel-address -O2 -c -o /dev/null cma.i

$ clang -fsanitize=kernel-address -O2 --target=s390x-linux-gnu -c -o /dev/null cma.i
cma.i:14:5: error: call to __read_overflow declared with 'error' attribute: detected read beyond size of object (1st parameter)
    __read_overflow();
    ^
1 error generated
nickdesaulniers commented 2 years ago

Thanks for the reduced test case.

Note that struct sockaddr is a curious case that @kees and @GustavoARSilva mentioned recently with a trailing array of specified length.

struct sockaddr_in's sin_addr member is of type struct in_addr, which looks like:

// include/uapi/linux/in.h
 89 struct in_addr {                                                                
 90   __be32  s_addr;                                                               
 91 };

while a struct sockaddr is clearly larger than 4B. So in the reduced case, indeed it seems that the memcmp is indeed over-reading the inputs. Restoring the original third parameter back to sizeof(((struct sockaddr_in *)sa)->sin_addr) makes the reproducer no longer work.

The memcmp is from compare_netdev_and_ip, but doesn't look obviously wrong to me...

Also, the reduce test case is not s390 specific. I can reproduce without the --target= flag on my x86_64 host.

kees commented 2 years ago

Oh, my darling sockaddr. Yes, I have some patches for this, but I haven't sent them yet given the timing of the pending merge window. Here's what I've got, though: https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/hardening&id=2e31e3f9bd18e173b8740ff43cc756ac07a507ed

nathanchance commented 2 years ago

Restoring the original third parameter back to sizeof(((struct sockaddr_in *)sa)->sin_addr) makes the reproducer no longer work.

I will see if I can tighten up the reproducer in that case.

https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/hardening&id=2e31e3f9bd18e173b8740ff43cc756ac07a507ed

I applied this on next-20220929 but it does not seem to fix this particular error.

kees commented 2 years ago

Okay, I think this is the third case of sanitizers messing up __bos(). Or actually, I suspect it might be __builtin_constant_p() that is misbehaving?

First: https://github.com/ClangBuiltLinux/linux/issues/1592 Second: https://lore.kernel.org/lkml/20220920192202.190793-2-keescook@chromium.org/

The furthest I got was confirming an insane value in __bos: https://github.com/ClangBuiltLinux/linux/issues/1592#issuecomment-1157246058

Looking at the reduce case above, I think cvise has wandered the wrong path. Basically, we'll always get a the build error if __read_overflow() hasn't been DCEed. In the fortified memcmp, it is wrapped in if (__builtin_constant_p(size)), so the result of any comparisons must be known at compile-time, so either it'll always __read_overflow() or it'll be DCEed.

If anything tricks __bcp or the DCE, there will be a "leftover" __read_overflow(). Hmm.

kees commented 2 years ago

With this patch on -next:

diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
index 4029fe368a4f..10ab393f89f4 100644
--- a/include/linux/fortify-string.h
+++ b/include/linux/fortify-string.h
@@ -10,7 +10,7 @@
 #define __RENAME(x) __asm__(#x)

 void fortify_panic(const char *name) __noreturn __cold;
-void __read_overflow(void) __compiletime_error("detected read beyond size of object (1st parameter)");
+void __read_overflow(void) __compiletime_warning("detected read beyond size of object (1st parameter)");
 void __read_overflow2(void) __compiletime_error("detected read beyond size of object (2nd parameter)");
 void __read_overflow2_field(size_t avail, size_t wanted) __compiletime_warning("detected read beyond size of field (2nd parameter); maybe use struct_group()?");
 void __write_overflow(void) __compiletime_error("detected write beyond size of object (1st parameter)");
@@ -514,6 +514,8 @@ __FORTIFY_INLINE void *memscan(void * const POS0 p, int c, __kernel_size_t size)
        return __real_memscan(p, c, size);
 }

+extern void emit(size_t wat);
+
 __FORTIFY_INLINE __diagnose_as(__builtin_memcmp, 1, 2, 3)
 int memcmp(const void * const POS0 p, const void * const POS0 q, __kernel_size_t size)
 {
@@ -521,6 +523,8 @@ int memcmp(const void * const POS0 p, const void * const POS0 q, __kernel_size_t
        size_t q_size = __struct_size(q);

        if (__builtin_constant_p(size)) {
+               emit(size);
+               emit(p_size);
                if (__compiletime_lessthan(p_size, size))
                        __read_overflow();
                if (__compiletime_lessthan(q_size, size))

I see this where __read_overflow manifests:

; ./drivers/infiniband/core/cma.c:482
;       if (sa->sa_family == AF_INET)
   147ca: ec 76 00 f4 02 7e     cijlh   %r7, 2, 0x149b2
...
; ./include/linux/fortify-string.h:526
;               emit(size);
   149b2: a7 29 00 10   lghi    %r2, 16
   149b6: c0 e5 00 00 00 00     brasl   %r14, 0x149b6
                00000000000149b8:  R_390_PLT32DBL       emit+0x2
; ./include/linux/fortify-string.h:527
;               emit(p_size);
   149bc: a7 29 00 08   lghi    %r2, 8
   149c0: c0 e5 00 00 00 00     brasl   %r14, 0x149c0
                00000000000149c2:  R_390_PLT32DBL       emit+0x2

size == 16 p_size == 8

The source here, line 482 of drivers/infiniband/core/cma.c is:

        if (sa->sa_family == AF_INET)
                return memcmp((char *)&((struct sockaddr_in *)sa)->sin_addr,
                              (char *)&((struct sockaddr_in *)sb)->sin_addr,
                              sizeof(((struct sockaddr_in *)sa)->sin_addr));

With KASAN disabled, I see:

; ./drivers/infiniband/core/cma.c:482
;       if (sa->sa_family == AF_INET)
    23dc: ec 06 00 1e 02 7e     cijlh   %r0, 2, 0x2418
...
    2418: a7 29 00 10   lghi    %r2, 16
; ./drivers/infiniband/core/cma.c:488
;                            &((struct sockaddr_in6 *)sb)->sin6_addr);
    241c: ec 91 fe 48 00 d9     <unknown>
; ./include/linux/fortify-string.h:526
;               emit(size);
    2422: c0 e5 00 00 00 00     brasl   %r14, 0x2422
                0000000000002424:  R_390_PLT32DBL       emit+0x2
; ./include/linux/fortify-string.h:527
;               emit(p_size);
    2428: a7 29 ff ff   lghi    %r2, -1
    242c: c0 e5 00 00 00 00     brasl   %r14, 0x242c
                000000000000242e:  R_390_PLT32DBL       emit+0x2

size == 16 p_size == SIZE_MAX

kees commented 2 years ago

So, __builtin_object_size((char *)&((struct sockaddr_in *)sa)->sin_addr, 0) is changing from SIZE_MAX to 8. (?!)

It doesn't make any sense. sizeof(struct sockaddr_in) == 16 and sizeof(struct sockaddr) == 16 It's just like #1592 where the __bos output doesn't appear to map to anything.

nathanchance commented 1 year ago

CI sees this problem in stable now on other architectures:

https://github.com/ClangBuiltLinux/continuous-integration2/actions/runs/4074338512

I bisected it to https://github.com/llvm/llvm-project/commit/f1764d5b594ff6edbf8d17d196eb0b7a653ac0fc.

# bad: [603c286334b07f568d39f6706c848f576914f323] Bump the trunk major version to 17
# good: [5265adc73721963d3bf605a7ad5eab6a7e0850b8] [SanitizerBinaryMetadata] Declare callbacks extern weak
git bisect start '603c286334b07f568d39f6706c848f576914f323' '5265adc73721963d3bf605a7ad5eab6a7e0850b8'
# good: [dfb8a0c99d7aae71382d4ca63d7a356663283a76] [InstCombine] add tests for 'not' of cast of cast; NFC
git bisect good dfb8a0c99d7aae71382d4ca63d7a356663283a76
# bad: [94f255c2c4d5c6733819affac5d1acb19e3f5e94] [mlir][tosa] Add RFFT2d operation
git bisect bad 94f255c2c4d5c6733819affac5d1acb19e3f5e94
# good: [e44a305690add9f759d2fba85827bca260c94b41] [InstCombine] invert canonicalization of sext (x > -1) --> not (ashr x)
git bisect good e44a305690add9f759d2fba85827bca260c94b41
# good: [1964c334782e7c5961772fbdcdcc0029cba3a7fa] [LinkerWrapper] Only import static libraries with needed symbols
git bisect good 1964c334782e7c5961772fbdcdcc0029cba3a7fa
# bad: [f1764d5b594ff6edbf8d17d196eb0b7a653ac0fc] [InlineCost] model calls to llvm.objectsize.*
git bisect bad f1764d5b594ff6edbf8d17d196eb0b7a653ac0fc
# good: [5d1dc9fa043f8ee239cbc16347025a4f0831e132] [OpenMP] Do not link the bitcode OpenMP runtime when targeting AMDGPU.
git bisect good 5d1dc9fa043f8ee239cbc16347025a4f0831e132
# good: [7532e88f38a417d3713cb179dc974948c30f0336] [Clang] Add missing requires directives for new test
git bisect good 7532e88f38a417d3713cb179dc974948c30f0336
# first bad commit: [f1764d5b594ff6edbf8d17d196eb0b7a653ac0fc] [InlineCost] model calls to llvm.objectsize.*

It is not visible in mainline but that appears to be due to the fact that KASAN is not enabled in allmodconfig anymore after commit e240e53ae0ab ("mm, slub: add CONFIG_SLUB_TINY").

I will try my best to get a new reproducer tomorrow.

nickdesaulniers commented 1 year ago

https://reviews.llvm.org/D141451 might help us pinpoint this

kees commented 1 year ago

This is a Clang + KASAN bug. It's changing the output of __bos() to be something impossible. :(

nathanchance commented 1 year ago

I am working on getting a trivial reproducer but one observation is that this is not visible in -next with the same configuration. I did a quick reverse bisect and the switch to __builtin_dynamic_object_size() appears to be the fix. I will continue to try and get a reproducer so that __builtin_object_size() can still potentially be fixed.

nathanchance commented 1 year ago

cvise spits out (after my longest reduction ever, 38h 21m 41.9s):

struct sockaddr {
  short sa_family;
};
struct sockaddr_in {
  int sin_addr;
} ipv6_addr_cmp_a2;
int *ipv6_addr_cmp_a1;
int compare_netdev_and_ip_sb_0, compare_netdev_and_ip___trans_tmp_1,
    node_from_ndev_ip_root_0, node_from_ndev_ip_data,
    cma_netevent_callback_neigh;
void fortify_panic(void);
void __read_overflow() __attribute__((__error__("")));
int __underlying_memcmp(void *, void *, long) __asm__("memcmp");
__attribute__((__overloadable__)) int memcmp(void *p, void *q,
                                             unsigned long size) {
  long p_size = __builtin_object_size(p, 0);
  if (p_size < size)
    __read_overflow();
  if (p_size)
    fortify_panic();
  return __underlying_memcmp(p, q, size);
}
int compare_netdev_and_ip(struct sockaddr *sa) {
  if (sa->sa_family)
    return memcmp(&((struct sockaddr_in *)sa)->sin_addr,
                  &compare_netdev_and_ip_sb_0,
                  ((struct sockaddr_in *)sa)->sin_addr);
  compare_netdev_and_ip___trans_tmp_1 =
      memcmp(ipv6_addr_cmp_a1, &ipv6_addr_cmp_a2, sizeof(int));
  return compare_netdev_and_ip___trans_tmp_1;
}
int node_from_ndev_ip(struct sockaddr *sa) {
  while (node_from_ndev_ip_root_0)
    node_from_ndev_ip_data = compare_netdev_and_ip(sa);
  return 0;
}
int cma_netevent_callback() {
  if (cma_netevent_callback_neigh) {
    struct sockaddr_in neigh_sock_4;
    node_from_ndev_ip((struct sockaddr *)&neigh_sock_4);
  }
  return 0;
}

for the following interestingness test (in fish, should be readable still I hope):

#!/usr/bin/env fish

set tmp_dir (dirname (realpath (status dirname)))
set install_folder $tmp_dir/install

set gcc $CBL_TC_STOW_GCC/12.2.0/bin/aarch64-linux-gcc
set bad_clang $install_folder/llvm-bad/bin/clang
set good_clang $install_folder/llvm-good/bin/clang

set clang_flags -Wno-error={gnu,self-assign,unused-value}
set common_flags \
    -Wall \
    -Werror \
    -Wfatal-errors \
    -Wno-error=pointer-sign \
    -c \
    -o /dev/null \
    cma.i
set gcc_flags -Wno-error={{,missing-}attributes,strict-aliasing}

set KASAN -fsanitize=kernel-address
set O1 -O1
set O2 -O2

set fish_trace 1
begin
chronic $gcc $gcc_flags $common_flags $O1
and chronic $gcc $gcc_flags $common_flags $O2
and chronic $good_clang $clang_flags $common_flags $O1
and chronic $good_clang $clang_flags $common_flags $O1 $KASAN
and chronic $good_clang $clang_flags $common_flags $O2
and chronic $good_clang $clang_flags $common_flags $O2 $KASAN
and chronic $bad_clang $clang_flags $common_flags $O2
end; or return

set error "error: call to '__read_overflow' declared with 'error' attribute"

$bad_clang $clang_flags $common_flags $O1 &| grep $error
test "$pipestatus" = "1 0"; or return

$bad_clang $clang_flags $common_flags $O2 $KASAN &| grep $error
test "$pipestatus" = "1 0"; or return

return 0

I noticed that prior to Nick's change, -O1 had no problems but afterwards, it exhibited the same issue as -O2 -fsanitize=kernel-address, so I added that as an additional case to test for. I hope that reproducer is better than the first one but if not, I can try to be a little bit more aggressive with the tests (like more warnings, etc).

nickdesaulniers commented 1 year ago

Memory dump for today.

So, __builtin_object_size((char )&((struct sockaddr_in )sa)->sin_addr, 0) is changing from SIZE_MAX to 8. (?!)

The 8 is coming from (in the IR):

%struct.in_addr = type { i32 }
%struct.sockaddr_in = type { i16, i16, %struct.in_addr, [8 x i8] }

%union.anon.84 = type { [4 x i32] }
%struct.in6_addr = type { %union.anon.84 }
%struct.sockaddr_in6 = type { i16, i16, i32, %struct.in6_addr, i32 }

declare i64 @llvm.objectsize.i64.p0(ptr, i1 immarg, i1 immarg, i1 immarg)

define i64 @foo () {
    %neigh_sock_4 = alloca %struct.sockaddr_in, align 4
    %sin6_addr.i.i20 = getelementptr inbounds %struct.sockaddr_in6, ptr %neigh_sock_4, i64 0, i32 3
    %x = call i64 @llvm.objectsize.i64.p0(ptr %sin6_addr.i.i20, i1 false, i1 true, i1 false)
    ret i64 %x
}

which instcombine is folding %x to i64 8. The GEP of a different type and identifiers of locals leads me to suspect that:

  1. neigh_sock_4 is local to cma_netevent_callback and is a struct sockaddr_in.
  2. The address of neigh_sock_4 is type punned to a struct sockaddr *.
  3. That address is passed to node_from_ndev_ip.
  4. That address is passed to compare_netdev_and_ip.
  5. That address is type punned to struct sockaddr_in6 * in compare_netdev_and_ip.
  6. The address of the offset of a sin6_addr member of a struct sockaddr_in6 derived from the previous pointer is passed to ipv6_addr_cmp.
  7. ipv6_addr_cmp calls memcmp with that address.
  8. The following statements:
    struct sockaddr neigh_sock_4;
    struct sockaddr *sa = &neigh_sock_4;
    return __builtin_object_size(&((struct sockaddr_in6 *)sa)->sin6_addr, 0);

    evaluate to 8.

  9. The size passed to memcmp is sizeof(struct in6_addr) is 16.

So given the statements:

        if (sa->sa_family == AF_INET)
                return memcmp((char *)&((struct sockaddr_in *)sa)->sin_addr,
                              (char *)&((struct sockaddr_in *)sb)->sin_addr,
                              sizeof(((struct sockaddr_in *)sa)->sin_addr));

        return ipv6_addr_cmp(&((struct sockaddr_in6 *)sa)->sin6_addr,
                             &((struct sockaddr_in6 *)sb)->sin6_addr);

it comes down to whether you can optimize one of those out or not. ipv6_addr_cmp is just a call to memcmp. Given the callchain: cma_netevent_callback -> node_from_ndev_ip -> compare_netdev_and_ip it probably depends on whether we inline those enough or not, in order for pass object size/buitin object size to not return -1; KASAN affects inlining decisions since it bloats the code.

Notice also how the condition for ipv4 vs ipv6 is different in cma_netevent_callback than in compare_netdev_and_ip:

static int cma_netevent_callback(struct notifier_block *self,
...
        } else if (neigh->tbl->family == AF_INET) {
                struct sockaddr_in neigh_sock_4;
...
                ips_node = node_from_ndev_ip(&id_table, neigh->dev->ifindex,
...

If I take a config that can repro the failure (fortify+kasan) then disable kasan, and compare the results of KCFLAGS=-Rpass=inline, I get the same inlining decisions wrt the above callchain. I wonder if perhaps something else isn't inlined with KASAN that would let us eliminate that condition.

Either way, I think the compiler is correct. If you follow the callchain cma_netevent_callback -> node_from_ndev_ip -> compare_netdev_and_ip assuming everything has been inlined, for the condition neigh->tbl->family == AF_INET I suspect there is a read overflow via memcmp in compare_netdev_and_ip when sa->sa_family != AF_INET. Instead there are runtime guards that likely make that case impossible to hit, but the compiler can or cannot determine that based on other inlining decisions.


I've added logging to the backend folder https://reviews.llvm.org/D143521 but these are all evaluating to -1. It looks like it's instcombine that can fold @llvm.objectsize as well as lower-is-constant-intrinsic.


https://github.com/llvm/llvm-project/blob/268249f4f14b7dd5755d79f4fb321e409ec5f892/clang/lib/CodeGen/CGExpr.cpp#L746-L747 looks interesting.

nickdesaulniers commented 1 year ago

Either way, I think the compiler is correct.

To test this, I wonder if I manually inline the call chain (cma_netevent_callback -> node_from_ndev_ip -> compare_netdev_and_ip but only for the neigh->tbl->family == AF_INET case) if that always reproduces the problem. Will check that tomorrow. If it does, then the fix is likely to split these functions in two to make them type safe, which will allow us to discard the type punning.

kees commented 1 year ago

I think the repro isn't going to exhibit the correct problem. There are meaningful differences from the original code: the sizeof() in the first memcmp went missing, and struct sockaddr is size 2. This repro should refuse to do that memcmp since it'll run off the end of the object.

Here is the call path in the original code:

static int cma_netevent_callback(struct notifier_block *self,
                                 unsigned long event, void *ctx)
{
...
                struct sockaddr_in neigh_sock_4;
...
                ips_node = node_from_ndev_ip(&id_table, neigh->dev->ifindex,
                                             (struct sockaddr *)&neigh_sock_4);
...
}

static struct id_table_entry *
node_from_ndev_ip(struct rb_root *root, int ifindex, struct sockaddr *sa)
{
...
                result = compare_netdev_and_ip(ifindex, sa, data);
...
}

static int compare_netdev_and_ip(int ifindex_a, struct sockaddr *sa,
                                 struct id_table_entry *entry_b)
{
...
                return memcmp((char *)&((struct sockaddr_in *)sa)->sin_addr,
                              (char *)&((struct sockaddr_in *)sb)->sin_addr,
                              sizeof(((struct sockaddr_in *)sa)->sin_addr));
...
}

#define POS0                    __pass_object_size(0)
#define __struct_size(p)        __builtin_object_size(p, 0)

__FORTIFY_INLINE __diagnose_as(__builtin_memcmp, 1, 2, 3)
int memcmp(const void * const POS0 p, const void * const POS0 q, __kernel_size_t size)
{
        size_t p_size = __struct_size(p);
        size_t q_size = __struct_size(q);
...
        if (size > p_size || size > q_size) {
                         ...freak out...
       }
...
}

struct sockaddr {
        sa_family_t     sa_family;      /* address family, AF_xxx       */
        union {
                char sa_data_min[14];           /* Minimum 14 bytes of protocol address */
                DECLARE_FLEX_ARRAY(char, sa_data);
        };
};

struct sockaddr_in {
  __kernel_sa_family_t  sin_family;     /* Address family               */
  __be16                sin_port;       /* Port number                  */
  struct in_addr        sin_addr;       /* Internet address             */

  /* Pad to size of `struct sockaddr'. */
  unsigned char         __pad[__SOCK_SIZE__ - sizeof(short int) -
                        sizeof(unsigned short int) - sizeof(struct in_addr)];
};

struct in_addr {
        __be32  s_addr;
};

The size argument to memcmp is sizeof(((struct sockaddr_in *)sa)->sin_addr)), which should be, indisputably, 4.

Depending on inlining, BOS will see either struct sockaddr or struct sockaddr_in as the "true" object under the casting done in compare_netdev_and_ip(). However, these two objects are the same size (both are 16 bytes total). So the "size available to the end of the original object" at the sin_addr offset must always be 16 - offsetof(struct sockaddr_in, sin_addr), or 16 - sizeof(sa_family_t) - sizeof(__be16), or 16 - 2 - 2, so 12. The other possible case is where __bos has chickened out completely and failed to track object sizes at all, at which point it should return SIZE_MAX.

So within memcmp, size must always be 4, and p_size and q_size must always be either 12 or SIZE_MAX. In all cases, they are constant expressions.

Looking back at my earlier investigation in https://github.com/ClangBuiltLinux/linux/issues/1687#issuecomment-1263217926 I was seeing seriously unexpected values. Is s390 rounding up these structures somehow? This must be something specific to the architectures since we only see it there.

nathanchance commented 1 year ago

Is s390 rounding up these structures somehow? This must be something specific to the architectures since we only see it there.

You may have missed my earlier comment, CI is seeing this on other architectures now after one of Nick’s recent LLVM changes that would affect inlining.

https://github.com/ClangBuiltLinux/linux/issues/1687#issuecomment-1414530910

That does not affect the rest of your analysis though :)

kees commented 1 year ago

Ah right. Well, that makes it easier to repro, I guess. :|

kees commented 1 year ago

Ah, it seems it's actually the ipv6_addr_cmp() path, FWIW.

kees commented 1 year ago

This is looking more and more legit. sockaddr_in6 is larger than sockaddr ...

struct sockaddr_in6 {
        unsigned short             sin6_family;          /*     0     2 */
        __be16                     sin6_port;            /*     2     2 */
        __be32                     sin6_flowinfo;        /*     4     4 */
        struct in6_addr            sin6_addr;            /*     8    16 */
        __u32                      sin6_scope_id;        /*    24     4 */

        /* size: 28, cachelines: 1, members: 5 */
        /* last cacheline: 28 bytes */
};
kees commented 1 year ago

This needs a mass sockaddr to sockaddr_storage refactor, like I had to do with NFS: commit cf0d7e7f4520814f45e1313872ad5777ed504004 I will give it a shot...

kees commented 1 year ago

Okay, after lots of refactoring ... the error remained. Commenting out various path, here's my current result:

                struct sockaddr_in neigh_sock_4;

                neigh_sock_4.sin_family = AF_INET;
                neigh_sock_4.sin_addr.s_addr = *(__be32 *)(neigh->primary_key);
                ips_node = node_from_ndev_ip(&id_table, neigh->dev->ifindex,
                                             (struct sockaddr_storage *)&neigh_sock_4);
...
static struct id_table_entry *
node_from_ndev_ip(struct rb_root *root, int ifindex, struct sockaddr_storage *sa)
{
...
                result = compare_netdev_and_ip(ifindex, sa, data);
...
}

static int compare_netdev_and_ip(int ifindex_a, struct sockaddr_storage *sa,
                                 struct id_table_entry *entry_b)
{
...
        if (sa->ss_family == AF_INET)
                return memcmp((char *)&((struct sockaddr_in *)sa)->sin_addr,
                              (char *)&((struct sockaddr_in *)sb)->sin_addr,
                              sizeof(((struct sockaddr_in *)sa)->sin_addr));

        return ipv6_addr_cmp(&((struct sockaddr_in6 *)sa)->sin6_addr,
                             &((struct sockaddr_in6 *)sb)->sin6_addr);
}

It seems inlining isn't eliding the != AF_INET case. So it sees that ipv6_addr_cmp() is called on struct sockaddr_in, which is too small. Or inlining reuses compare_netdev_and_ip(), but collapses the size check on only neigh_sock_4 and loses sight of neigh_sock_6?

kees commented 1 year ago

Yeah. This makes the warning go away for me:

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 1d2bff91d78b..60efe7313c94 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -5144,25 +5144,25 @@ static int cma_netevent_callback(struct notifier_block *self,
        struct rdma_id_private *current_id;
        struct neighbour *neigh = ctx;
        unsigned long flags;
+       union {
+               struct sockaddr_in6 _6;
+               struct sockaddr_in  _4;
+       } neigh_sock;

        if (event != NETEVENT_NEIGH_UPDATE)
                return NOTIFY_DONE;

        spin_lock_irqsave(&id_table_lock, flags);
        if (neigh->tbl->family == AF_INET6) {
-               struct sockaddr_in6 neigh_sock_6;
-
-               neigh_sock_6.sin6_family = AF_INET6;
-               neigh_sock_6.sin6_addr = *(struct in6_addr *)neigh->primary_key;
+               neigh_sock._6.sin6_family = AF_INET6;
+               neigh_sock._6.sin6_addr = *(struct in6_addr *)neigh->primary_key;
                ips_node = node_from_ndev_ip(&id_table, neigh->dev->ifindex,
-                                            (struct sockaddr *)&neigh_sock_6);
+                                            (struct sockaddr *)&neigh_sock._6);
        } else if (neigh->tbl->family == AF_INET) {
-               struct sockaddr_in neigh_sock_4;
-
-               neigh_sock_4.sin_family = AF_INET;
-               neigh_sock_4.sin_addr.s_addr = *(__be32 *)(neigh->primary_key);
+               neigh_sock._4.sin_family = AF_INET;
+               neigh_sock._4.sin_addr.s_addr = *(__be32 *)(neigh->primary_key);
                ips_node = node_from_ndev_ip(&id_table, neigh->dev->ifindex,
-                                            (struct sockaddr *)&neigh_sock_4);
+                                            (struct sockaddr *)&neigh_sock._4);
        } else
                goto out;
nathanchance commented 1 year ago

This makes the warning go away for me

Confirmed, I tested -next, mainline, and 6.1 and saw no warnings across any architectures that saw this.

nickdesaulniers commented 1 year ago

This is looking more and more legit. sockaddr_in6 is larger than sockaddr ... So it sees that ipv6_addr_cmp() is called on struct sockaddr_in, which is too small.

Yes, that's what I said. Hopefully my comment helped make that slightly clearer, even if it was verbose:

ipv6_addr_cmp is just a call to memcmp. Either way, I think the compiler is correct. If you follow the callchain cma_netevent_callback -> node_from_ndev_ip -> compare_netdev_and_ip assuming everything has been inlined, for the condition neigh->tbl->family == AF_INET I suspect there is a read overflow via memcmp in compare_netdev_and_ip when sa->sa_family != AF_INET. Instead there are runtime guards that likely make that case impossible to hit, but the compiler can or cannot determine that based on other inlining decisions.

Patch LGTM.

nickdesaulniers commented 1 year ago

I wonder if the code in question should be using struct sockaddr_storage?

kees commented 1 year ago

I remain curious why the assignment/check pair isn't seen by the inliner, but the object size is. Specifically:

The assignment of neigh_sock_4.sin_family = AF_INET; in cma_netevent_callback() is paired with the test if (sa->sa_family == AF_INET) in compare_netdev_and_ip(). Clang either knows the size of sa or it doesn't. If it knows the size of sa how did it forget that sa->sa_family is AF_INET?

nickdesaulniers commented 1 year ago

Because of the type punning w/o use of a union. Maybe this example makes it more clear: https://godbolt.org/z/7Kf1T4ox3 (in this analogy, foo->x = 1; is the assignment of neigh_sock_4.sin_family = AF_INET; in cma_netevent_callback, the first foo->x is the neigh->tbl->family == AF_INET check in cma_netevent_callback, while the second foo->x and ((struct foo_like*)foo)->y is the sa->sa_family == AF_INET check in compare_netdev_and_ip.)

When a human looks close enough, they can see that those alias. LLVM currently does not, so it cannot eliminate what is effectively a repeated subexpression. So if you were depending on the call to oh_no() being DCE'd, I'm sorry Dave. I'm afraid I can't do that.

static int cma_netevent_callback(struct notifier_block *self,
  ...
  if (neigh->tbl->family == AF_INET) {
    // type pun of neigh_sock_4 from struct sockaddr_in to struct sockaddr
    ...
    // inlined from node_from_ndev_ip
    // inlined from compare_netdev_and_ip
    if (sa->sa_family == AF_INET)
kees commented 1 year ago

Why doesn't this work, then?

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 1d2bff91d78b..c43d68cdbdca 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -5157,12 +5157,13 @@ static int cma_netevent_callback(struct notifier_block *self,
                ips_node = node_from_ndev_ip(&id_table, neigh->dev->ifindex,
                                             (struct sockaddr *)&neigh_sock_6);
        } else if (neigh->tbl->family == AF_INET) {
-               struct sockaddr_in neigh_sock_4;
+               struct sockaddr sock;
+               struct sockaddr_in *neigh_sock_4 = (void *)&sock;

-               neigh_sock_4.sin_family = AF_INET;
-               neigh_sock_4.sin_addr.s_addr = *(__be32 *)(neigh->primary_key);
+               sock.sa_family = AF_INET;
+               neigh_sock_4->sin_addr.s_addr = *(__be32 *)(neigh->primary_key);
                ips_node = node_from_ndev_ip(&id_table, neigh->dev->ifindex,
-                                            (struct sockaddr *)&neigh_sock_4);
+                                            &sock);
        } else
                goto out;
kees commented 1 year ago

And here is your godbolt with inlining, and the type punning does work: https://godbolt.org/z/zvMfTEvMM Or here, with the cast separated from the test: https://godbolt.org/z/fn83hf93h

nickdesaulniers commented 1 year ago

LLVM currently does not, so it cannot eliminate what is effectively a repeated subexpression. Maybe this example makes it more clear: https://godbolt.org/z/7Kf1T4ox3

bah, that asm statement I was adding was mucking things up...sorry. I was wrong on that point.


The other case I was looking at yesterday is conditions with repeated pointer chasing. I've fully inlined the relevant call chain to basically:

--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -5126,49 +5126,34 @@ static void cma_netevent_work_handler(struct work_struct *_work)
+extern struct sockaddr *sb;
+
+noinline
 static int cma_netevent_callback(struct notifier_block *self,
                                 unsigned long event, void *ctx)
 {
+  struct neighbour *neigh = ctx;
+
+  if (neigh->tbl->family == AF_INET) {
+    struct sockaddr_in neigh_sock_4 = {
+      .sin_family = AF_INET,
+    };
+
+
+    if (neigh->tbl->family == AF_INET) { //
+      return memcmp((char *)&((struct sockaddr_in *)&neigh_sock_4)->sin_addr,
+          (char *)&((struct sockaddr_in *)sb)->sin_addr,
+          sizeof(((struct sockaddr_in *)&neigh_sock_4)->sin_addr));
+    } else {
+      return ipv6_addr_cmp(&((struct sockaddr_in6 *)&neigh_sock_4)->sin6_addr,
+          &((struct sockaddr_in6 *)sb)->sin6_addr);
+    }
+  }
+  return 0;
 }

extracting that (https://godbolt.org/z/P6Ecjr4Wq), I can see a simple difference between -fsanitize=kernel-address and not. GVNPass doesn't eliminate the second comparison when kasan is enabled. I'm not sure yet whether that's a bug or a feature. ah, https://github.com/llvm/llvm-project/blob/c7c910caee0979d7c5ff883a0c916822bac32426/llvm/lib/Transforms/Scalar/GVN.cpp#L1720-L1722 If I comment out that line, my https://godbolt.org/z/P6Ecjr4Wq case removes the redundant load and drivers/infiniband/core/cma.o compiles without issue.

Looks like that guard was added by https://github.com/llvm/llvm-project/commit/c7810baaa6761fe9771d4b3cba675266e576b19d to address https://github.com/llvm/llvm-project/issues/25924 (https://lists.llvm.org/pipermail/llvm-dev/attachments/20151114/1c7d8dbe/attachment.c was the original reproducer). Perhaps there's another way to solve both problems with a new solution...

nickdesaulniers commented 1 year ago

I'll work on reducing the test cases further tomorrow, but for now I believe these will fix the issue:

  1. https://reviews.llvm.org/D144056
  2. https://reviews.llvm.org/D144057
nickdesaulniers commented 1 year ago

Workaround applied:

commit 876e480da2f7 ("RDMA/cma: Distinguish between sockaddr_in and sockaddr_in6 by size") v6.3-rc1