Open paolobarbolini opened 4 years ago
https://github.com/rust-lang/rust/issues/65969 is similar but more complex scenario of this.
This works and has even fewer branches:
pub fn problematic(buf: &[u8]) -> &[u8] {
for i in 0 .. buf.len() {
if buf[i] == 0x00 {
return &buf[i..];
}
}
&[]
}
In my experience all the variables whose bounds checks you want to have eliminated should be derived from by the slice bounds (either directly or through arithmetic). It often doesn't seem to recognize that the value range of another variable in the loop is constrained in such a way that it also falls within the bounds.
See also #73396
In my experience all the variables whose bounds checks you want to have eliminated should be derived from by the slice bounds (either directly or through arithmetic)
I agree @the8472, the original issue was from something like this:
pub fn problematic(buf: &[u8]) -> &[u8] {
for (i, b) in buf.iter().enumerate() {
// this was a matcher, simplified it for this
if *b == 0x00 {
return &buf[i..];
}
}
&[]
}
And the solution was to write it as:
pub fn solution(mut buf: &[u8]) -> &[u8] {
while !buf.is_empty() {
// just simplifying this here, in the final code it's a match checking more things
if buf[0] == 0x00 {
break;
} else {
buf = &buf[1..];
}
}
buf
}
I thought the if i >= buf.len() { break; }
thing would be interesting, but maybe it doesn't matter anyway since seeing from the issue you linked to it has already been fixed in LLVM?
EDIT: maybe not
I tried this code and many variations of it and I found out this doesn't elide the bounds check on
&buf[i..]
.Assembly Output
```assembly example::problematic: pushq %rax leaq .L__unnamed_1(%rip), %rax xorl %edx, %edx testq %rsi, %rsi je .LBB0_6 xorl %ecx, %ecx .LBB0_2: cmpb $0, (%rdi,%rcx) je .LBB0_4 addq $1, %rcx cmpq %rcx, %rsi jne .LBB0_2 .LBB0_6: popq %rcx retq .LBB0_4: cmpq %rsi, %rcx ja .LBB0_7 subq %rcx, %rsi addq %rcx, %rdi movq %rsi, %rdx movq %rdi, %rax popq %rcx retq .LBB0_7: leaq .L__unnamed_2(%rip), %rdx movq %rcx, %rdi callq *core::slice::slice_index_order_fail@GOTPCREL(%rip) ud2 .L__unnamed_3: .ascii "./example.rs" .L__unnamed_2: .quad .L__unnamed_3 .asciz "\f\000\000\000\000\000\000\000\005\000\000\000\025\000\000" .L__unnamed_1: ```Manually checking that
i
is ok elides the bounds check and ends up being optimized away!?Assembly Output
```assembly example::optimized: leaq .L__unnamed_1(%rip), %rax testq %rsi, %rsi je .LBB0_5 movq %rsi, %rdx .LBB0_2: cmpb $0, (%rdi) je .LBB0_3 addq $1, %rdi addq $-1, %rdx jne .LBB0_2 .LBB0_5: xorl %edx, %edx retq .LBB0_3: movq %rdi, %rax retq .L__unnamed_1: ```This third example generates the same assembly as the second one, so I think it's really optimizing away the manual check on the second example?
Assembly Output
```assembly example::optimized_unsafe: leaq .L__unnamed_1(%rip), %rax testq %rsi, %rsi je .LBB0_5 movq %rsi, %rdx .LBB0_2: cmpb $0, (%rdi) je .LBB0_3 addq $1, %rdi addq $-1, %rdx jne .LBB0_2 .LBB0_5: xorl %edx, %edx retq .LBB0_3: movq %rdi, %rax retq .L__unnamed_1: ```Possibly related: #74186