Closed by huonw 10 years ago
Implementing #9546 would fix this.
In theory, LLVM should be able to determine that this null check is unnecessary without additional metadata. There are two separate changes to LLVM's optimizer that are required:
An `inbounds` GEP either produces a valid pointer into an allocated object or a poison value. In address space 0, there is no allocated object at the zero address. This implies that any `inbounds` GEP in address space 0 that evaluates to null is a poison value. According to the LLVM LangRef, the `icmp` would depend on the poison value, and any instruction that depends on poison values exhibits undefined behavior. However, Dan Gohman (sunfish) tells me that it was only intended to apply to instructions exhibiting externally visible side effects, as otherwise it would mean that any `add` instruction could potentially have undefined behavior. Any chain of `inbounds` GEPs and phis ending with a load of the `inbounds` value should be undefined behavior, because of how poison values behave like `undef`. We can't just assume that an `inbounds` GEP produces a nonnull value, because then that implies that `inbounds` GEPs themselves can have undefined behavior when the runtime value is actually null. Lots of optimizations depend on being able to hoist GEPs, e.g. out of loops, and that wouldn't be possible if they potentially had undefined behavior.
Correctly implementing the rule for poison values in LazyValueInfo would be quite difficult, because it requires reasoning about control-dependence with respect to poison values. Also, the compile-time cost of making LazyValueInfo optimistic might be too high to get the patch landed.
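To make the null check under discussion concrete, here is a minimal sketch in present-day Rust rather than the 2014 syntax used in this thread. `RawIter`, `sum`, and `option_ref_is_pointer_sized` are hypothetical names invented for illustration; this is not the standard library's slice iterator. The sketch assumes that `Option<&T>` is stored as a single pointer with `None` encoded as null, which Rust documents, and that `ptr.add` is lowered to an `inbounds` GEP, which is my understanding of current rustc but is not stated in this thread.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Hypothetical, simplified slice iterator: a pair of raw pointers.
/// Assumes `T` is not zero-sized.
pub struct RawIter<'a, T> {
    ptr: *const T,
    end: *const T,
    _marker: PhantomData<&'a T>,
}

impl<'a, T> RawIter<'a, T> {
    pub fn new(slice: &'a [T]) -> Self {
        let ptr = slice.as_ptr();
        // In-bounds pointer offset; one past the end is allowed.
        let end = unsafe { ptr.add(slice.len()) };
        RawIter { ptr, end, _marker: PhantomData }
    }
}

impl<'a, T> Iterator for RawIter<'a, T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<&'a T> {
        if self.ptr == self.end {
            None
        } else {
            // The element pointer is dereferenced and then advanced in bounds.
            let item = unsafe { &*self.ptr };
            self.ptr = unsafe { self.ptr.add(1) };
            Some(item)
        }
    }
}

/// `None` for `Option<&T>` is encoded as a null pointer, so both sizes match.
pub fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&usize>>() == size_of::<&usize>()
}

/// A `for` loop written out by hand: the `None` arm is a comparison of the
/// returned pointer against null unless the optimizer can remove it.
pub fn sum(xs: &[usize]) -> usize {
    let mut total = 0;
    let mut it = RawIter::new(xs);
    loop {
        match it.next() {
            None => break,
            Some(&x) => total += x,
        }
    }
    total
}
```

The `Some(item)` returned from `next` always wraps a pointer derived from in-bounds offsets, so the `None` test in `sum` is redundant with the `ptr == end` test; the question in this thread is how LLVM can be taught to see that.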
In #9546 there is a proposal to add metadata on LLVM instructions that indicates that the instruction produces a nonnull value. There are two reasons why this proposal would be a bit more difficult than it seems at first:
`nonnull` attribute on function parameters and return values, because the attribute is erased upon inlining.
Another option would be to write a pass that looks for chains of `inbounds` GEPs, `nonnull` parameters / return values, and phis of these that feed into loads and stores control-dependent on null checks of some intermediate value in the chain. The null checks could be replaced with false, and then hopefully other optimization passes would be able to clean everything up.
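The chain-walking part of such a pass can be sketched with a toy model. Everything below is hypothetical: the `Value` enum and `in_nonnull_chain` are invented for illustration and are not LLVM's IR, its API, or the code of the actual pass. The sketch assumes a chain must bottom out in a `nonnull` parameter or return value, and it leaves out the control-dependence check on loads and stores entirely.

```rust
use std::collections::HashSet;

/// Toy value kinds for illustration only; this is not LLVM's IR or API.
#[derive(Debug)]
pub enum Value {
    /// A parameter or return value carrying the `nonnull` attribute.
    NonNull,
    /// An `inbounds` GEP whose base pointer is the value at index `base`.
    InboundsGep { base: usize },
    /// A phi node merging the values at the given indices.
    Phi { incoming: Vec<usize> },
    /// Anything else, e.g. a pointer loaded from memory.
    Other,
}

/// Returns true if the value at `id` is built only from `inbounds` GEPs,
/// `nonnull` values, and phis of these.
pub fn in_nonnull_chain(values: &[Value], id: usize, visited: &mut HashSet<usize>) -> bool {
    if !visited.insert(id) {
        // Reached again through a loop back-edge: assume it is fine and let
        // the other incoming edges decide (an optimistic, inductive choice).
        return true;
    }
    match &values[id] {
        Value::NonNull => true,
        Value::InboundsGep { base } => in_nonnull_chain(values, *base, visited),
        Value::Phi { incoming } => incoming
            .iter()
            .all(|&v| in_nonnull_chain(values, v, visited)),
        Value::Other => false,
    }
}
```

For example, a loop pointer modeled as `[NonNull, Phi { incoming: vec![0, 2] }, InboundsGep { base: 1 }]` is accepted when queried at index 1, while a GEP whose base is `Other` is rejected. A real pass would additionally have to prove that a load or store of the pointer is control-dependent on the null check before replacing the check with false.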
Sounds like the last option is the easiest. I also like the fact that it's a separate pass, meaning that if we have trouble getting it upstream we can maintain it in our branch for a while.
I wrote the optimization pass I described. It is able to optimize the first case (with `&[uint]`), but it is unable to optimize the second case (with `Vec<uint>`). I have an informal inductive argument for why it should still always be nonnull with LLVM IR's poison value semantics, but implementing it as code will be trickier.
Unfortunately, my pass causes the compiled `rustc` to segfault when compiling `liblibc`. That might be a pain to track down. The problem could be in my code, or `rustc` could just be marking a GEP `inbounds` when it shouldn't.
I found the issue and put a first cut of my pass up as https://github.com/rust-lang/llvm/pull/13.
\o/
On Jun 28, 2014 12:24 AM, "Cameron Zwarich" wrote:
I found the issue and put a first cut of my pass up as https://github.com/rust-lang/llvm/pull/13.
The pass that was landed handles the `&[T]` case. There are two obvious remaining things to do:
`inbounds` GEP. This corresponds to the `Vec<T>` case. In theory, we could add `nonnull` metadata to the load (since loads are allowed to have undefined behavior), but applying the LLVM rules inductively actually lets the checks be removed without that.
Actually, that second point applies to the IR generated by `&[T]` as well. I must've oversimplified it when I made test cases. I'll try to solve that soon then.
I have local changes that fix those two items. In order to make this apply to `zip()`, I also need to track dominating conditions from other blocks. This shouldn't be that much more work, so maybe I'll hold out until I have that working.
Those changes are up as https://github.com/rust-lang/llvm/pull/14/. I'll need to add less conservative control dependence checking, and then it should be able to handle arbitrary chaining of zipped iterators.
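For readers unfamiliar with the zipped shape, the following sketch writes out by hand what a loop over two zipped slice iterators amounts to. `dot` is a hypothetical example function, not one of the test cases from this issue, and the standard library's current `zip` implementation for slices may be specialized differently from this hand-written form.

```rust
/// Hypothetical example: a loop over two slices in lockstep, with the
/// `None` check of each iterator written out explicitly.
pub fn dot(a: &[usize], b: &[usize]) -> usize {
    let mut ia = a.iter();
    let mut ib = b.iter();
    let mut total = 0;
    loop {
        // First check: lives in its own basic block.
        let x = match ia.next() {
            Some(x) => x,
            None => break,
        };
        // Second check: only reached when the first one has passed.
        let y = match ib.next() {
            Some(y) => y,
            None => break,
        };
        total += x * y;
    }
    total
}
```

The second `None` check runs only after the first has succeeded, which illustrates why conditions established in other blocks matter once iterators are chained this way.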
Compiled with `-O --lib --emit-llvm -S` gives the following. The only major difference between `&[]`/`Vec` and `~[]` is the two lines marked `THIS CHECK`, which, we think, is because when constructing an iterator from `~[]` we do a pointer offset and dereference, so LLVM knows the pointers are non-null (in the slice/`Vec` case, the `match it.next() { None => ... }` part of the for loop isn't removed).