Open scottmcm opened 2 years ago
This is another casualty of LLVM's dumb type-based GEP representation. Both branches write the same values to the same locations, but LLVM doesn't realize this due to the nominally different GEPs. https://github.com/rust-lang/rust/pull/98615 would probably fix this.
This also works
pub fn result_nop_match_64(x: Result<i64, u64>) -> Result<i64, u64> {
match x {
val @ Ok(_) => val,
val @ Err(_) => val,
}
}
@the8472 That doesn't really surprise me, since in that the matching is completely unused -- it's basically just let val = x; val
, and thus has worked since even before LLVM 15: https://rust.godbolt.org/z/G45rx9a3d
(Once it's translated to SSA form, it'll be basically %return = phi [%x,%bb1], [%x,%bb2]
, which instcombine can trivially turn into %return = %x
without needing to make contextual decisions, and then the rest of the stuff is clearly dead.)
I'm glad that LLVM is fixing this. This case can be covered by MIR optimizations, but doing it soundly turns out to be quite tricky. Discussion of the exact semantics surrounding this can be found here, but the tldr is that under the model that Ralf currently prefers (I am not convinced yet), turning
let val = match some_place {
Ok(x) => Ok(x),
Err(x) => Err(x),
};
into let val = some_place;
is unsound in the general case - the problem is that the latter introduces a read of any padding bytes in some_place
which was not there before. As such, the optimization requires proving that such a read is ok to introduce.
For the case in the original issue, this can be proven by showing that there are no valid pointers to those bytes at the time of the read (if we had SSA this would also be straightforward, but that seems far off). If we instead were to write something like this (note the reference):
pub fn result_nop_match_64(x: &Result<i64, u64>) -> Result<i64, u64> {
match *x {
Ok(x) => Ok(x),
Err(x) => Err(x),
}
}
This can be proven sound under current SB. Neither of these analyses results are easily available in MIR right now though.
The issue appears fixed on nightly: https://rust.godbolt.org/z/6o7TPqjbq
So this can probably be closed after updating the codegen test.
Nice! It's been a long way since https://github.com/rust-lang/rust/issues/37939.
If no one shows up, I can create the tests.
@the8472 interestingly, the comment at the top is (partially) outdated — 128 bit integers work just fine! Whoever will be adding tests, please add tests for them too (see https://rust.godbolt.org/z/W4YffGx4q).
However, Option
is still not looking as it should, although this time it's the opposite to this issue — match
is optimal while try { x? }
is suboptimal. Should we open a new issue for that?
P.S. why does llvm-ir still look so complicated? do the optimizations happen on the instruction level, after llvm-ir? P.P.S. why does adding an llvm-ir window completely screw up the asm output again (see here)? is this a godbolt bug or something?...
I encountered an issue that may be related to this one.
I was profiling some code and Option::or
showed up in the profile which is weird.
Looking at the code it seems that suboptimal code is generated for some T
in Option<T>
.
The weirdest thing is that u8
optimizes well but [u8; 1]
doesn't.
There are some sample codegen tests for this in https://github.com/rust-lang/rust/pull/100693/files#diff-68a12d31335d7f061169500381ebaf9634034cf419332c16512355df8c8cfddeR16.
Since this is now just needs-test
, it should be an easy
PR to add the previously-problematic widths from the OP as new codegen tests similar to those other ones.
https://github.com/rust-lang/rust/issues/101210#issuecomment-1732470941 looks different, so I've made a new #124533 for it
Hey @scottmcm, I can open a PR if this is not assigned yet. Just to confirm, we want to add tests for the original issue what has already been fixed in nightly? Thanks!
I can open a PR if this is not assigned yet. Just to confirm, we want to add tests for the original issue what has already been fixed in nightly? Thanks!
@heiseish Yes, but as Waffle mentioned please also include test for 128 bit widths https://rust.godbolt.org/z/W4YffGx4q
@rustbot claim
As the new codegen test confirms, matching and rebuilding a
Result<i32, u32>
optimizes away now (on x64).However, it seemingly doesn't optimize away if the ok and error types are 64 bits:
https://rust.godbolt.org/z/fMn6aqWG8
Interestingly, it does optimize away when there's another enum in the middle, like happens in the
?
desugaring. From that same godbolt,gives