llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.99k stars 11.95k forks source link

function with error attribute not eliminated after commit 3589cacfa8da8 #85647

Open nathanchance opened 7 months ago

nathanchance commented 7 months ago

This is a copy of the downstream report: https://github.com/ClangBuiltLinux/linux/issues/2007

After https://github.com/llvm/llvm-project/commit/3589cacfa8da89b9b5051e4dba659caa575e6b3f, I see the following error when building the Linux kernel for x86_64 using the allmodconfig configuration target, which enables CONFIG_KCSAN (i.e, -fsanitize=thread):

$ make -skj"$(nproc)" ARCH=x86_64 LLVM=1 clean allmodconfig net/bluetooth/sco.o
In file included from net/bluetooth/sco.c:27:
In file included from include/linux/module.h:13:
In file included from include/linux/stat.h:19:
In file included from include/linux/time.h:60:
In file included from include/linux/time32.h:13:
In file included from include/linux/timex.h:67:
In file included from arch/x86/include/asm/timex.h:6:
In file included from arch/x86/include/asm/tsc.h:10:
In file included from arch/x86/include/asm/msr.h:15:
In file included from include/linux/percpu.h:7:
In file included from include/linux/smp.h:118:
include/linux/thread_info.h:244:4: error: call to '__bad_copy_from' declared with 'error' attribute: copy source size is too small
  244 |                         __bad_copy_from();
      |                         ^
1 error generated.

which comes from check_copy_size() in copy_to_user(). I also see the same error emitted in net/bluetooth/l2cap_sock.c. If I disable CONFIG_KCSAN, there is no error. The parent of 3589cacfa8da89b9b5051e4dba659caa575e6b3f does not show this error and GCC does not show an error either, so I am not sure the kernel code is at fault here but I could be missing something obvious.

cvise spits out:

struct {
  short hci_handledev_class[3];
} sco_sock_getsockopt_old_cinfo;
typeof(__builtin_choose_expr(
    sizeof(char), 0,
    __builtin_choose_expr(
        sizeof(short), 0,
        __builtin_choose_expr(sizeof(int), 0,
                              __builtin_choose_expr(sizeof(long), 0, 0)))))
    sco_sock_getsockopt_old___val_gu;
void kmsan_unpoison_memory(void *address) {}
void __attribute__((__error__("copy source size is too small")))
__bad_copy_from();
long copy_to_user(void *from, long n) {
  void *addr = from;
  long bytes = n;
  int sz = __builtin_object_size(addr, 0);
  if (__builtin_expect(sz >= 0 && sz < bytes, 0))
    __bad_copy_from();
  return 0;
}
int sco_sock_getsockopt_old() {
  int len;
  long __tmp = sco_sock_getsockopt_old___val_gu;
  kmsan_unpoison_memory(&__tmp);
  len = __tmp;
  len = __builtin_choose_expr(
      0, 0, ({
        typeof(0) __UNIQUE_ID___x1646 = len;
        unsigned __UNIQUE_ID___y1647 = sizeof(sco_sock_getsockopt_old_cinfo);
        __UNIQUE_ID___x1646 < __UNIQUE_ID___y1647 ? __UNIQUE_ID___x1646 : 0;
      }));
  copy_to_user(&sco_sock_getsockopt_old_cinfo, len);
  return 0;
}

GCC 13.2.0:

$ x86_64-linux-gcc -O2 -Wall -c -o /dev/null sco.i

$ x86_64-linux-gcc -O2 -Wall -fsanitize=thread -c -o /dev/null sco.i

Clang @ https://github.com/llvm/llvm-project/commit/0cbbcf1ef006ce13a1fa94960067723982ae955a:

$ clang -O2 -Wall -c -o /dev/null sco.i

$ clang -O2 -Wall -fsanitize=thread -c -o /dev/null sco.i

Clang @ https://github.com/llvm/llvm-project/commit/3589cacfa8da89b9b5051e4dba659caa575e6b3f:

$ clang -O2 -Wall -c -o /dev/null sco.i

$ clang -O2 -Wall -fsanitize=thread -c -o /dev/null sco.i
sco.i:19:5: error: call to '__bad_copy_from' declared with 'error' attribute: copy source size is too small
   19 |     __bad_copy_from();
      |     ^
1 error generated.

cc @dtcxzyw

dtcxzyw commented 7 months ago

Missed optimization: https://alive2.llvm.org/ce/z/YDWOu3

dtcxzyw commented 7 months ago

I think https://github.com/llvm/llvm-project/pull/69840 may fix it, but unfortunately this PR is blocking :( @nikic Any thoughts?

dtcxzyw commented 7 months ago

Ping @nikic

nathanchance commented 7 months ago

I applied #69840 on top of a4de589d117a4fd52554da3c61ae6eb26c90a0c8. While it does fix the issue in the smaller reproducer, it does not fix the original build issue with the full Linux kernel source file like reverting 3589cacfa8da89b9b5051e4dba659caa575e6b3f does. Here is a reproducer that triggers the error even with #69840 applied but not with 3589cacfa8da89b9b5051e4dba659caa575e6b3f reverted:

struct {
  short hci_handledev_class[3];
} sco_sock_getsockopt_old_optlen_cinfo;
int sco_sock_getsockopt_old_sock_0;
long sco_sock_getsockopt_old_optlen___tmp;
void __attribute__((__error__("copy source size is too small")))
__bad_copy_from();
long copy_to_user(void *from, long n) {
  void *addr = from;
  long bytes = n;
  int sz = __builtin_object_size(addr, 0);
  if (__builtin_expect(sz < bytes, 0)) {
    if (!__builtin_constant_p(bytes))
      ;
    else
      __bad_copy_from();
  }
  return n;
}
int sco_sock_getsockopt_old_optlen() {
  int sk = sco_sock_getsockopt_old_sock_0,
      len = sco_sock_getsockopt_old_optlen___tmp;
  if (sk)
    __builtin_choose_expr(0, 0, ({}));
  len = __builtin_choose_expr(
      0, 0, ({
        typeof(0) __UNIQUE_ID___x1646 = len;
        unsigned __UNIQUE_ID___y1647 =
            sizeof(sco_sock_getsockopt_old_optlen_cinfo);
        __UNIQUE_ID___x1646 < __UNIQUE_ID___y1647 ? __UNIQUE_ID___x1646 : 0;
      }));
  copy_to_user(&sco_sock_getsockopt_old_optlen_cinfo, len);
  return 0;
}
$ clang -O2 -Wall -c -o /dev/null sco.i

$ clang -O2 -Wall -fsanitize=thread -c -o /dev/null sco.i
sco.i:16:7: error: call to '__bad_copy_from' declared with 'error' attribute: copy source size is too small
   16 |       __bad_copy_from();
      |       ^
1 error generated.

If this appears to be too reduced, I can provide

dtcxzyw commented 7 months ago

I applied #69840 on top of a4de589. While it does fix the issue in the smaller reproducer, it does not fix the original build issue with the full Linux kernel source file like reverting 3589cac does. Here is a reproducer that triggers the error even with #69840 applied but not with 3589cac reverted:

struct {
  short hci_handledev_class[3];
} sco_sock_getsockopt_old_optlen_cinfo;
int sco_sock_getsockopt_old_sock_0;
long sco_sock_getsockopt_old_optlen___tmp;
void __attribute__((__error__("copy source size is too small")))
__bad_copy_from();
long copy_to_user(void *from, long n) {
  void *addr = from;
  long bytes = n;
  int sz = __builtin_object_size(addr, 0);
  if (__builtin_expect(sz < bytes, 0)) {
    if (!__builtin_constant_p(bytes))
      ;
    else
      __bad_copy_from();
  }
  return n;
}
int sco_sock_getsockopt_old_optlen() {
  int sk = sco_sock_getsockopt_old_sock_0,
      len = sco_sock_getsockopt_old_optlen___tmp;
  if (sk)
    __builtin_choose_expr(0, 0, ({}));
  len = __builtin_choose_expr(
      0, 0, ({
        typeof(0) __UNIQUE_ID___x1646 = len;
        unsigned __UNIQUE_ID___y1647 =
            sizeof(sco_sock_getsockopt_old_optlen_cinfo);
        __UNIQUE_ID___x1646 < __UNIQUE_ID___y1647 ? __UNIQUE_ID___x1646 : 0;
      }));
  copy_to_user(&sco_sock_getsockopt_old_optlen_cinfo, len);
  return 0;
}
$ clang -O2 -Wall -c -o /dev/null sco.i

$ clang -O2 -Wall -fsanitize=thread -c -o /dev/null sco.i
sco.i:16:7: error: call to '__bad_copy_from' declared with 'error' attribute: copy source size is too small
   16 |       __bad_copy_from();
      |       ^
1 error generated.

If this appears to be too reduced, I can provide

I will take a look.

dtcxzyw commented 7 months ago

Missing fold: https://alive2.llvm.org/ce/z/zCyK24

define i1 @src(i64 %a) {
  %conv = trunc i64 %a to i32
  %cmp = icmp ult i32 %conv, 6
  %2 = and i64 %a, 7
  %cmp.i5 = icmp eq i64 %2, 7
  %cmp.i = and i1 %cmp, %cmp.i5
  ret i1 %cmp.i
}

define i1 @tgt(i32 %a) {
  ret i1 false
}

Emm, it is hard to handle this pattern in LLVM because currently we don't have an efficient QBVF solver :(

As a mitigation, please use long for all size values to avoid creating unexpected casting instructions. See also https://github.com/jemalloc/jemalloc/pull/2611.

TBH it is not a stable way to assert that the boolean expression always evaluates to true/false. Please use formal verification tools instead :)

cc @nikic @fhahn

dtcxzyw commented 7 months ago

unsigned UNIQUEIDy1647 = sizeof(sco_sock_getsockopt_old_cinfo);

Changing the type of __UNIQUE_ID___y1647 to long fixes this issue.

nathanchance commented 7 months ago

Emm, it is hard to handle this pattern in LLVM because currently we don't have an efficient QBVF solver :(

As a mitigation, please use long for all size values to avoid creating unexpected casting instructions. See also jemalloc/jemalloc#2611.

Thanks for taking a look and the additional information. I have gone ahead and sent https://lore.kernel.org/20240401-bluetooth-fix-len-type-getsockopt_old-v1-1-c6b5448b5374@kernel.org/ for this.

noxwell commented 3 weeks ago

I have more minified and understandable example, which is equal to what @nathanchance provides:

struct rfcomm_conninfo {
  short hci_handledev_class[3];
} rfcomm_sock_getsockopt_old_cinfo;

__attribute__((__error__(""))) void __bad_copy_from();

void copy_to_user(void *from, long len) {
  int sz = __builtin_object_size(from, 0);
  if (__builtin_expect(sz < len, 0))
    if (__builtin_constant_p(len))
      __bad_copy_from();
}

void rfcomm_sock_getsockopt_old(long foo) {
  int len = foo;
  len = ({ len < sizeof(rfcomm_sock_getsockopt_old_cinfo) ? len : 0; });
  copy_to_user(&rfcomm_sock_getsockopt_old_cinfo, len);
}

Compile with: clang -O2 -fsanitize=thread -c -o test.o test.c

And I think the main source of issue is that optimizer removes __builtin_constant_p(len) check completely when compiled with -O2. You can even put ! in front of constness condition and it will still produce error, while on previous clang version it mattered. That hints that the __builtin_constant_p(len) was not evaluated at all!

@dtcxzyw Could I ask you to have one more look on this bug from @llvm.is.constant perspective? This condition alone should have been evaluated to false, leaving __bad_copy_from unreachable.

dtcxzyw commented 3 weeks ago

Before GVNPass:

define dso_local void @rfcomm_sock_getsockopt_old(i64 noundef %0) local_unnamed_addr #0 {
  %2 = trunc i64 %0 to i32
  %3 = icmp ult i32 %2, 6
  %4 = and i64 %0, 7
  %5 = select i1 %3, i64 %4, i64 0
  %6 = icmp eq i64 %5, 7
  br i1 %6, label %7, label %10, !prof !5

7:                                                ; preds = %1
  %8 = tail call i1 @llvm.is.constant.i64(i64 %5)
  br i1 %8, label %9, label %10

9:                                                ; preds = %7
  tail call void (...) @__bad_copy_from() #4, !srcloc !6
  br label %10

10:                                               ; preds = %1, %7, %9
  ret void
}

After GVN:

define dso_local void @rfcomm_sock_getsockopt_old(i64 noundef %0) local_unnamed_addr #0 {
  %2 = trunc i64 %0 to i32
  %3 = icmp ult i32 %2, 6
  %4 = and i64 %0, 7
  %5 = select i1 %3, i64 %4, i64 0
  %6 = icmp eq i64 %5, 7
  br i1 %6, label %7, label %10, !prof !5

7:                                                ; preds = %1
  br i1 true, label %9, label %8

8:                                                ; preds = %7
  br label %10

9:                                                ; preds = %7
  tail call void (...) @__bad_copy_from() #4, !srcloc !6
  br label %10

10:                                               ; preds = %8, %1, %9
  ret void
}

%5 is known to be 7. So @llvm.is.constant should be evaluated to true.

As I said before, the key point is folding the pattern https://github.com/llvm/llvm-project/issues/85647#issuecomment-2029679785. But currently we don't peek through trunc to analysis knownbits...