Open nathanchance opened 7 months ago
Missed optimization: https://alive2.llvm.org/ce/z/YDWOu3
I think https://github.com/llvm/llvm-project/pull/69840 may fix it, but unfortunately this PR is blocking :( @nikic Any thoughts?
Ping @nikic
I applied #69840 on top of a4de589d117a4fd52554da3c61ae6eb26c90a0c8. While it does fix the issue in the smaller reproducer, it does not fix the original build issue with the full Linux kernel source file like reverting 3589cacfa8da89b9b5051e4dba659caa575e6b3f does. Here is a reproducer that triggers the error even with #69840 applied but not with 3589cacfa8da89b9b5051e4dba659caa575e6b3f reverted:
struct {
short hci_handledev_class[3];
} sco_sock_getsockopt_old_optlen_cinfo;
int sco_sock_getsockopt_old_sock_0;
long sco_sock_getsockopt_old_optlen___tmp;
void __attribute__((__error__("copy source size is too small")))
__bad_copy_from();
long copy_to_user(void *from, long n) {
void *addr = from;
long bytes = n;
int sz = __builtin_object_size(addr, 0);
if (__builtin_expect(sz < bytes, 0)) {
if (!__builtin_constant_p(bytes))
;
else
__bad_copy_from();
}
return n;
}
int sco_sock_getsockopt_old_optlen() {
int sk = sco_sock_getsockopt_old_sock_0,
len = sco_sock_getsockopt_old_optlen___tmp;
if (sk)
__builtin_choose_expr(0, 0, ({}));
len = __builtin_choose_expr(
0, 0, ({
typeof(0) __UNIQUE_ID___x1646 = len;
unsigned __UNIQUE_ID___y1647 =
sizeof(sco_sock_getsockopt_old_optlen_cinfo);
__UNIQUE_ID___x1646 < __UNIQUE_ID___y1647 ? __UNIQUE_ID___x1646 : 0;
}));
copy_to_user(&sco_sock_getsockopt_old_optlen_cinfo, len);
return 0;
}
$ clang -O2 -Wall -c -o /dev/null sco.i
$ clang -O2 -Wall -fsanitize=thread -c -o /dev/null sco.i
sco.i:16:7: error: call to '__bad_copy_from' declared with 'error' attribute: copy source size is too small
16 | __bad_copy_from();
| ^
1 error generated.
If this appears to be too reduced, I can provide
I applied #69840 on top of a4de589. While it does fix the issue in the smaller reproducer, it does not fix the original build issue with the full Linux kernel source file like reverting 3589cac does. Here is a reproducer that triggers the error even with #69840 applied but not with 3589cac reverted:
struct { short hci_handledev_class[3]; } sco_sock_getsockopt_old_optlen_cinfo; int sco_sock_getsockopt_old_sock_0; long sco_sock_getsockopt_old_optlen___tmp; void __attribute__((__error__("copy source size is too small"))) __bad_copy_from(); long copy_to_user(void *from, long n) { void *addr = from; long bytes = n; int sz = __builtin_object_size(addr, 0); if (__builtin_expect(sz < bytes, 0)) { if (!__builtin_constant_p(bytes)) ; else __bad_copy_from(); } return n; } int sco_sock_getsockopt_old_optlen() { int sk = sco_sock_getsockopt_old_sock_0, len = sco_sock_getsockopt_old_optlen___tmp; if (sk) __builtin_choose_expr(0, 0, ({})); len = __builtin_choose_expr( 0, 0, ({ typeof(0) __UNIQUE_ID___x1646 = len; unsigned __UNIQUE_ID___y1647 = sizeof(sco_sock_getsockopt_old_optlen_cinfo); __UNIQUE_ID___x1646 < __UNIQUE_ID___y1647 ? __UNIQUE_ID___x1646 : 0; })); copy_to_user(&sco_sock_getsockopt_old_optlen_cinfo, len); return 0; }
$ clang -O2 -Wall -c -o /dev/null sco.i $ clang -O2 -Wall -fsanitize=thread -c -o /dev/null sco.i sco.i:16:7: error: call to '__bad_copy_from' declared with 'error' attribute: copy source size is too small 16 | __bad_copy_from(); | ^ 1 error generated.
If this appears to be too reduced, I can provide
I will take a look.
Missing fold: https://alive2.llvm.org/ce/z/zCyK24
define i1 @src(i64 %a) {
%conv = trunc i64 %a to i32
%cmp = icmp ult i32 %conv, 6
%2 = and i64 %a, 7
%cmp.i5 = icmp eq i64 %2, 7
%cmp.i = and i1 %cmp, %cmp.i5
ret i1 %cmp.i
}
define i1 @tgt(i32 %a) {
ret i1 false
}
Emm, it is hard to handle this pattern in LLVM because currently we don't have an efficient QBVF solver :(
As a mitigation, please use long
for all size values to avoid creating unexpected casting instructions. See also https://github.com/jemalloc/jemalloc/pull/2611.
TBH it is not a stable way to assert that the boolean expression always evaluates to true/false. Please use formal verification tools instead :)
cc @nikic @fhahn
unsigned UNIQUEIDy1647 = sizeof(sco_sock_getsockopt_old_cinfo);
Changing the type of __UNIQUE_ID___y1647
to long
fixes this issue.
Emm, it is hard to handle this pattern in LLVM because currently we don't have an efficient QBVF solver :(
As a mitigation, please use
long
for all size values to avoid creating unexpected casting instructions. See also jemalloc/jemalloc#2611.
Thanks for taking a look and the additional information. I have gone ahead and sent https://lore.kernel.org/20240401-bluetooth-fix-len-type-getsockopt_old-v1-1-c6b5448b5374@kernel.org/ for this.
I have more minified and understandable example, which is equal to what @nathanchance provides:
struct rfcomm_conninfo {
short hci_handledev_class[3];
} rfcomm_sock_getsockopt_old_cinfo;
__attribute__((__error__(""))) void __bad_copy_from();
void copy_to_user(void *from, long len) {
int sz = __builtin_object_size(from, 0);
if (__builtin_expect(sz < len, 0))
if (__builtin_constant_p(len))
__bad_copy_from();
}
void rfcomm_sock_getsockopt_old(long foo) {
int len = foo;
len = ({ len < sizeof(rfcomm_sock_getsockopt_old_cinfo) ? len : 0; });
copy_to_user(&rfcomm_sock_getsockopt_old_cinfo, len);
}
Compile with: clang -O2 -fsanitize=thread -c -o test.o test.c
And I think the main source of issue is that optimizer removes __builtin_constant_p(len)
check completely when compiled with -O2
. You can even put !
in front of constness condition and it will still produce error, while on previous clang version it mattered. That hints that the __builtin_constant_p(len)
was not evaluated at all!
@dtcxzyw Could I ask you to have one more look on this bug from @llvm.is.constant
perspective? This condition alone should have been evaluated to false, leaving __bad_copy_from
unreachable.
Before GVNPass:
define dso_local void @rfcomm_sock_getsockopt_old(i64 noundef %0) local_unnamed_addr #0 {
%2 = trunc i64 %0 to i32
%3 = icmp ult i32 %2, 6
%4 = and i64 %0, 7
%5 = select i1 %3, i64 %4, i64 0
%6 = icmp eq i64 %5, 7
br i1 %6, label %7, label %10, !prof !5
7: ; preds = %1
%8 = tail call i1 @llvm.is.constant.i64(i64 %5)
br i1 %8, label %9, label %10
9: ; preds = %7
tail call void (...) @__bad_copy_from() #4, !srcloc !6
br label %10
10: ; preds = %1, %7, %9
ret void
}
After GVN:
define dso_local void @rfcomm_sock_getsockopt_old(i64 noundef %0) local_unnamed_addr #0 {
%2 = trunc i64 %0 to i32
%3 = icmp ult i32 %2, 6
%4 = and i64 %0, 7
%5 = select i1 %3, i64 %4, i64 0
%6 = icmp eq i64 %5, 7
br i1 %6, label %7, label %10, !prof !5
7: ; preds = %1
br i1 true, label %9, label %8
8: ; preds = %7
br label %10
9: ; preds = %7
tail call void (...) @__bad_copy_from() #4, !srcloc !6
br label %10
10: ; preds = %8, %1, %9
ret void
}
%5
is known to be 7. So @llvm.is.constant
should be evaluated to true.
As I said before, the key point is folding the pattern https://github.com/llvm/llvm-project/issues/85647#issuecomment-2029679785. But currently we don't peek through trunc to analysis knownbits...
This is a copy of the downstream report: https://github.com/ClangBuiltLinux/linux/issues/2007
After https://github.com/llvm/llvm-project/commit/3589cacfa8da89b9b5051e4dba659caa575e6b3f, I see the following error when building the Linux kernel for x86_64 using the
allmodconfig
configuration target, which enablesCONFIG_KCSAN
(i.e,-fsanitize=thread
):which comes from
check_copy_size()
incopy_to_user()
. I also see the same error emitted innet/bluetooth/l2cap_sock.c
. If I disable CONFIG_KCSAN, there is no error. The parent of 3589cacfa8da89b9b5051e4dba659caa575e6b3f does not show this error and GCC does not show an error either, so I am not sure the kernel code is at fault here but I could be missing something obvious.cvise
spits out:GCC 13.2.0:
Clang @ https://github.com/llvm/llvm-project/commit/0cbbcf1ef006ce13a1fa94960067723982ae955a:
Clang @ https://github.com/llvm/llvm-project/commit/3589cacfa8da89b9b5051e4dba659caa575e6b3f:
cc @dtcxzyw