Open loskutov opened 2 years ago
To get memset as in the clang version you can replace
    for j in 0..n {
        d[i][j] = 0;
    }
with
    d[i][0..n].fill(0);
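For comparison, here is a minimal sketch of the two forms side by side as complete functions. The names `zero_loop` and `zero_fill` are hypothetical and do not appear in the original report. Both versions assume every row has at least `n` elements and panic otherwise.

```rust
/// Indexed loop: every `d[i][j]` access carries a bounds check
/// unless the optimizer can prove the index is in range.
pub fn zero_loop(d: &mut [Vec<i32>]) {
    let n = d.len();
    for i in 0..n {
        for j in 0..n {
            d[i][j] = 0;
        }
    }
}

/// Slice fill: one range check per row when `[0..n]` is taken,
/// followed by a bulk fill of that sub-slice.
pub fn zero_fill(d: &mut [Vec<i32>]) {
    let n = d.len();
    for i in 0..n {
        d[i][0..n].fill(0);
    }
}
```

Whether the fill actually lowers to a `memset` call depends on the compiler version and optimization level, so it is worth confirming in Compiler Explorer.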
The code example is pretty synthetic and doesn't make much sense on its own; however, similar effects can be observed in slightly more complicated programs, e.g. the Floyd-Warshall algorithm.
My point is that the missed optimization is pretty important in real-life programs, as it's not uncommon to be unable to avoid array indexing, and bounds-checking overhead is often claimed to be negligible as the checks are likely to be optimized out. And of course I expect them to be optimized out when they are optimized out by clang in an equivalent C++ program.
There are multiple ways to solve the problem, like:
    pub fn zero(d: &mut [Vec<i32>]) {
        let n = d.len();
        for i in 0 .. n {
            let row = &mut d[i][.. n];
            for j in 0 .. n {
                row[j] = 0;
            }
        }
    }
What's the best way to solve this problem depends on the specific situation.
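One further option, sketched here as an assumption rather than something proposed in the thread, is to avoid indexing altogether and iterate over the row prefix. The name `zero_iter` is hypothetical. Like the re-slicing version, it panics if a row is shorter than `n` and leaves any elements past index `n` untouched.

```rust
/// Zero the first `n` elements of each row, where `n` is the number of rows.
/// Iterating over `row[..n]` leaves a single range check per row and no
/// per-element indexing.
pub fn zero_iter(d: &mut [Vec<i32>]) {
    let n = d.len();
    for row in d.iter_mut() {
        for x in row[..n].iter_mut() {
            *x = 0;
        }
    }
}
```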
True. However, the issue is about poor codegen for this particular piece of code, not about looking for a way to rewrite it.
Your original code contains an assert whose purpose is to tell the optimizer something. Often that helps, but you can't be sure the compiler will use the information given by the assert to its fullest. For the compiler it's a "best effort", so you are waiting for it to do what you want on that basis.
Thanks for including the vector::at C++ version!
I suggest that you prefer re-slicing to asserts for avoiding bounds checks. For example, this one works:
    pub fn zero(d: &mut [Vec<i32>]) {
        let n = d.len();
        for i in 0..n {
            let di = &mut d[i][..n];
            for j in 0..n {
                di[j] = 0;
            }
        }
    }
https://rust.godbolt.org/z/f19nasGPh
And that's a general strategy for improving bounds-check elimination. It's how the new reverse implementation works, for example (#90821).
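To illustrate the general strategy outside the zeroing example, here is a small sketch applying re-slicing to two slices at once. The function `add_assign_prefix` is hypothetical and is not taken from the reverse implementation in #90821. The idea is that after both slices are cut to the same length `n`, every index `i < n` is known to be in range for both, which gives the optimizer a good chance of dropping the per-element checks.

```rust
/// Add `src[i]` into `dst[i]` over the common prefix of the two slices.
pub fn add_assign_prefix(dst: &mut [i32], src: &[i32]) {
    let n = dst.len().min(src.len());
    // Re-slice once up front: both slices now have length exactly `n`.
    let dst = &mut dst[..n];
    let src = &src[..n];
    for i in 0..n {
        dst[i] += src[i];
    }
}
```

As with the examples above, whether the checks are actually eliminated is up to the compiler, so the generated assembly is the final word.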
Compiler Explorer link: https://godbolt.org/z/KMTfY18Wf
I tried this code:
With -C llvm-args=-enable-constraint-elimination, I expected the bounds checks to be optimized out, as they're performed manually by assert!. However, both the assert and the bounds checks were present in the compiled code. The inner loop looks really dumb in terms of code generation: rsi != rdx is the loop invariant, but it's verified on each iteration.

At first I suspected the issue to be entirely on the LLVM side, but bounds checks are omitted in equivalent C++ code (which is also present at the Compiler Explorer link), and the whole inner loop is replaced with a single memset call.

Meta
rustc --version --verbose: