Open clubby789 opened 5 months ago
Not sure if this is necessarily A-LLVM actually; it looks like we don't give enough information to codegen to improve this
We actually do:
%0 = alloca ...
%1 = load i8, ptr %0, align 1, !range <range info>
Somewhere in SROA that range information seems to get lost
Codegen also becomes optimal if a and b are references
Looks like this is because of the range metadata on the load: https://godbolt.org/z/bPfezGhW6.
So this may be fixed after LLVM 19, since we'll be able to put range metadata on the by-value arguments.
(range
on arguments has been merged upstream, but optimization support is being implemented right now, so the following isn't optimized yet, but it may be soon: https://godbolt.org/z/17s6jTfMv)
So this may be fixed after LLVM 19, since we'll be able to put range metadata on the by-value arguments.
(
range
on arguments has been merged upstream, but optimization support is being implemented right now, so the following isn't optimized yet, but it may be soon: https://godbolt.org/z/17s6jTfMv)
@erikdesjardins Are you adding relevant support? Or someone has started.
I'm curious why we don't put the range metadata on the non-load instruction.
Reference: https://github.com/llvm/llvm-project/pull/83171.
Yeah, that's the original PR that added support, and there have been a few follow ups so far to have different passes check for range attributes. Since there's still quite a while until LLVM 19, I imagine that support will be added to every pass or analysis that currently uses range metadata by that time.
I'm curious why we don't put the range metadata on the non-load instruction.
LLVM generally only checks for such metadata on loads and calls (e.g.), so I think adding it to other instructions won't have much of an effect.
The only usage that allows any kind of instruction appears to be here in InstSimplify. It's used only to optimize comparisons that are always true or always false.
I'm curious why we don't put the range metadata on the non-load instruction.
LLVM generally only checks for such metadata on loads and calls (e.g.), so I think adding it to other instructions won't have much of an effect.
IIUC, we can add range metadata to other instructions. We just don't do that now. We can quickly get the range of arbitrary values when the source value provides range metadata. Adding range metadata for other instructions doesn't seem necessary if that's correct.
Somewhere in SROA that range information seems to get lost
I'm pretty sure that's because we store to the alloca then load from it again. So a very early pass replaces the load-store with just using the argument, but in doing so doesn't preserve the !range
metadata anywhere.
It'd need to insert an assume
of some sort to keep it, and that has enough other issues that LLVM usually doesn't do that, AFAIK.
Somewhere in SROA that range information seems to get lost
I'm pretty sure that's because we store to the alloca then load from it again. So a very early pass replaces the load-store with just using the argument, but in doing so doesn't preserve the
!range
metadata anywhere.It'd need to insert an
assume
of some sort to keep it, and that has enough other issues that LLVM usually doesn't do that, AFAIK.
I've tried it on https://github.com/rust-lang/rust/pull/122728. I got a huge change in the perf result. ðŸ«
I got a huge change in the perf result. ðŸ«
I'd been thinking about the possibility of LLVM doing the !range
→ assume
translation as part of removing the load
. But yeah, that may well have the same perf result problem.
https://godbolt.org/z/f3TMe3rcW
good
gets optimal codegen since the LLVM 18 bump, butbad
has had poor codegen for some time. Codegen also becomes optimal ifa
andb
are references rather than passed by value.