llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.26k stars 12.08k forks source link

InstCombine cannot blindly assume that inttoptr(ptrtoint x) -> x #33896

Open nunoplopes opened 7 years ago

nunoplopes commented 7 years ago
Bugzilla Link 34548
Version trunk
OS All
Blocks llvm/llvm-project#34577 llvm/llvm-project#39193
CC @comex,@majnemer,@vns-mn,@dwblaikie,@efriedma-quic,@fhahn,@hfinkel,@jensmaurer,@dobbelaj-snps,@aqjune,@RKSimon,@Meinersbur,@sunfishcode,@MattPD,@uecker,@RalfJung,@regehr,@rnk,@sanjoy,@rotateright,@yuanfang-chen

Extended Description

Example of an end-to-end miscompilation by clang of the following code involving ptrtoint:

$ cat c.c

include

void f(int, int);

int main() { int a=0, y[1], x = 0; uintptr_t pi = (uintptr_t) &x; uintptr_t yi = (uintptr_t) (y+1); uintptr_t n = pi != yi;

if (n) { a = 100; pi = yi; }

if (n) { a = 100; pi = (uintptr_t) y; }

(int )pi = 15;

printf("a=%d x=%d\n", a, x);

f(&x,y);

return 0; }

$ cat b.c void f(intx, inty) {}

$ clang -O2 c.c b.c -o foo

$ ./foo a=0 x=0

This result is wrong. The two possible outcomes are: a=0 x=15, and a=100 x=0.

The bug is in Instcombine that treats inttoptr(ptrtoint(x)) == x, which is incorrect. This transformation can only be done if x is dereferenceable for the accesses through inttoptr. Compare the following: clang -O0 -S -emit-llvm -o - c.c | opt -S -sroa clang -O0 -S -emit-llvm -o - c.c | opt -S -sroa -instcombine

Integer compares are replaces with pointer compares (wrong) and load/stores are changed from inttoptr to pointers directly (also wrong).

Test case by Gil Hur.

dwblaikie commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#52570

RalfJung commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#46380

aqjune commented 2 years ago

mentioned in issue llvm/llvm-project#39193

aqjune commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#37469

RalfJung commented 2 years ago

mentioned in issue llvm/llvm-project#34577

4584a04b-325b-46e1-813c-87051a6e36f1 commented 3 years ago

For those who have not seen this. WG14 now has a draft TS which addresses this topic:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2676.pdf

RalfJung commented 4 years ago

If some transform isn't consistent with this, please file a bug.

llvm/llvm-bugzilla-archive#46380

efriedma-quic commented 4 years ago

icmp is defined to just take the raw pointer bits as an integer. This is described in LangRef.

Doesn't LLVM also assume that a fresh pointer returned by malloc does not equal any older pointer, and optimize icmp accordingly?

There's a class of optimizations where LLVM implicitly replaces malloc/new with an alloca, and we can optimize comparisons against that. In cases where that optimization isn't legal, LLVM doesn't make any assumptions, as far as I know.

RalfJung commented 4 years ago

icmp is defined to just take the raw pointer bits as an integer. This is described in LangRef.

Doesn't LLVM also assume that a fresh pointer returned by malloc does not equal any older pointer, and optimize icmp accordingly? They could physically be equal if addresses are reused (and that easily happens e.g. with jemalloc). C has a clause that pointers become undef when you deallocate the block they point to, but I have not seen such a clause in the LVLM semantics (but I might have just missed it).

efriedma-quic commented 4 years ago

Not sure whether LLVM's AliasAnalysis interface is only defined for valid pointer representations (ie including one-past-the-last) of just dereferenceable pointers. The latter would be hard for analysis since one had to proof whether a pointer is dereferenceable before an AA query.

The alias() API is defined for any pair of pointers which are both defined at some position in the program. The question is essentially, given two pointer values, for any location in the program where both pointer values are defined, is it possible that a store to one pointer would conflict with a load or store to the other pointer? The possible answers are essentially yes, no, or don't know.

If it in't legal to dereference a pointer, it trivially doesn't alias anything: there are no memory accesses through the pointer, so there isn't any possible conflict. Similarly, if a pointer points to a constant, it doesn't alias anything: there are no stores through it.

The LLVM reference (except getelementptr inbounds) seem to miss concepts of valid pointer representation, pointer provenance or partial ordering of pointers. The ability of BasicAA to deduce pointers as non-aliasing based on provenance seem to be inferred from C/C++ semantics.

icmp is defined to just take the raw pointer bits as an integer. This is described in LangRef. If the text of LangRef describing icmp isn't sufficiently clear, suggestions welcome. If some transform isn't consistent with this, please file a bug.

For provenance, the rules we have are described at http://llvm.org/docs/LangRef.html#pointeraliasing . The problem described here is that the rules don't actually work.

See also http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2311.pdf .

Meinersbur commented 4 years ago

The LLVM reference (except getelementptr inbounds) seem to miss concepts of valid pointer representation, pointer provenance or partial ordering of pointers. The ability of BasicAA to deduce pointers as non-aliasing based on provenance seem to be inferred from C/C++ semantics.

From the semantics of icmp:

If the operands are pointer typed, the pointer values are compared as if they were integers.

Sanjoy Das seem to have worked on non-integral pointer types which cannot be created "from thin air" by disallowing integer conversion instructions. http://llvm.org/docs/LangRef.html#nointptrtype

RalfJung commented 4 years ago

Note that to just allocate another byte past the last element of arrays (comment #​30) for stack variables and globals (and malloc implementations, in particular BumpPtrAllocator) would probably be easiest: It makes the int2ptr function injective. Similar the reason why the size of an empty struct is not zero and malloc(0) returns an unique allocation.

getelementptr without inbounds is allowed to create pointers that are more than "one past the end", and it is likewise allowed to cast those to integers and back. So, in LLVM, I don't think there is any way to make int2ptr injective.

Also see https://people.mpi-sws.org/~jung/twinsem/twinsem.pdf for a detailed discussion of LLVM pointer semantics.

Meinersbur commented 4 years ago

Just mentioning it would be easiest and there is precedence. For stack variables, I initially expected it to only necessary whenever there is pointer arithmetic involving non-constants which is only common for e.g. buffers. However, you're right it might be necessary for any captured variable; especially in C where there a pointer of a scalar variable is an array of size one and pass-by-reference is done by passing the pointer.

There is an aspect of AliasAnalysis that I did not consider: it might only care about dereferenceable pointers. Since the one-past-the-last pointer is not dereferenceable, int2ptr is injective over the set of derefenceable pointers. However, in the original test case, AliasAnalysis is queried about &x and one-past &y. &x is dereferenceable, but one-past &y is not which BasicAA does not consider.

Not sure whether LLVM's AliasAnalysis interface is only defined for valid pointer representations (ie including one-past-the-last) of just dereferenceable pointers. The latter would be hard for analysis since one had to proof whether a pointer is dereferenceable before an AA query.

regehr commented 4 years ago

Michael, my guess is that adding a byte (which of course becomes more than a byte when there are alignment requirements) is a showstopper for memory-constrained embedded use cases. Also note that (I think!) this would have to be done even for stack variables and globals that are not arrays.

Meinersbur commented 4 years ago

Note that to just allocate another byte past the last element of arrays (comment #​30) for stack variables and globals (and malloc implementations, in particular BumpPtrAllocator) would probably be easiest: It makes the int2ptr function injective. Similar the reason why the size of an empty struct is not zero and malloc(0) returns an unique allocation.

Meinersbur commented 4 years ago

I came here from https://blog.regehr.org/archives/1737 and can maybe add another perspective.

It seems that the bug is the result of mixing two concepts of a pointer. First, as a symbolic reference to another value, second as the bit-pattern of a memory address. The former could also be represented as a tuple (baseptr,offset), the baseptr being the pointer's provenance. Pointers with different provenance cannot alias.

The problem being that there exist identical address bit-patterns that represent different logical pointers, namely the address of the pointer one past the baseptr's memory allocation can be another allocation.[*]

In the model of pointer as a logical reference, ptr2int is an implementation-defined non-injective (non-surjective) functions. That is, p!=q does not imply ptr2int(p)!=ptr2int(q) because of the validity of one-past-the-end pointers. Therefore, int2ptr is ambiguous: we cannot infer an unique (baseptr,offset) from the integer value. The conversion int2ptr(ptr2int(p)) therefore is 'lossy'.

AliasAnalysis is linked to pointer comparison in that if we can deduce p+n < q || q+n < p to be unconditionally false, then they do not not alias.

I hope this formalizes a bit what is also called 'magic', 'opaque' and 'non-deterministic' in this thread. Probably it's not just me who got confused.

Clearly, we want the transformation (a != b ? b : a) == b (comment #​17) to eventually take place on the address bit-pattern. I don't see why the final program needs to run slower on pointers than on integers. Note that, on the IR level, this mixes-up provenance regardless whether a or be are the result of int2ptr (comment #​43), so a solution that only handles int2ptr will be insufficient, at least for front-end languages that don't make pointer comparison with different provenance undefined.

I can think of two solutions:

  1. We introduce two phases into the pass pipeline. In the first phase, pointers are logical references with provenance. The identity of pointers must be kept distinct unless it can be shown that they have the same provenance, especially within GVN. Operations such as q - p are undefined (poison?) if q and p have different provenance.

In the second phase, pointers are assumed to be address bit-patterns, i.e. ptr2int is assumed on any pointers. This is a kind of lowering. AliasAnalysis may not assume provenance anymore.

  1. We track provenance explicitly (comment #​54), e.g. using metadata inserted by e.g. the front-end:

    %x = alloca !provenance !​0 %add.ptr = getelementptr inbounds [1 x i32], [1 x i32] %y, i64 0, i64 1 !provenance !​1 %cmp = icmp ne i32 %x, %add.ptr %add.ptr.x = select i1 %cmp, i32 %add.ptr, i32 %x

Since provenance is language concept, it makes sense to not just assume the same thing in the IR, especially since there seem to be differences in C++ and C (comment #​77). If my interpretation is correct, the icmp above is required to be an integer comparison in C11, but can be assumed to be false in C++. In any case, the provenance of %add.ptr.x as to be cleared ("assumed unknown") since it merges pointers of different provenance to a single pointer. Alternatively, provenance could be a set. Stripping all the provenance matadata would be equivalent to entering phase two of the previous solution. Unfortunately, this solution increases the overhead into the direction of ScopedAA.

Note that there is the special case of std::less: To make std::set/std::map work, it must be defined on pointers of different provenance. https://en.cppreference.com/w/cpp/utility/functional/less says:

If the function call operator of the specialization std::less calls a built-in operator comparing pointers, it yields a strict total order even if the built-in operator< does not.

In libc++, I don't see any special mechanism to make on all pointers (https://github.com/llvm/llvm-project/blob/master/libcxx/include/__functional_base#L54). This case might be interesting:

https://godbolt.org/z/mU5nZp

 int x[1], y[1];
 std::set<int*> MySet;
 MySet.insert(&x[1]);
 MySet.insert(&y[0]);
 printf("count = %d\n", (int)MySet.size());

[*] We ignore that within a segmented memory model, or virtual page tables, multiple virtual memory addresses can be mapped to the same physical memory.

llvmbot commented 4 years ago

There are many thoughtful and thought-provoking comments here, thanks for this! But one comment has far-reaching consequences and especially resonated with my older thoughts:

To put it another way, if I have a comparison that gives one answer to some observers and other answers to other observers, there may be nothing that prevents that logical contradiction from infecting the remainder of the execution.

This is a quite powerful principle.

  1. An illustration for its application to comparisons of the form &x == &y + 1 in gcc should be here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502#c42. If clang is ever modified to make such comparisons unstable again I guess it would be applicable to clang too. (But it's straightforward to demonstrate the same with bug 44342 for now if anyone wants.)

  2. Is it possible to have unfrozen undef at all? See bug 44512.

aqjune commented 4 years ago

Any small example test case about how the magic pointer can break things?

There is now a bunch of examples in bug 44313 and bug 44374. Mostly they provide full simple examples for aspects already touched in this bug in some way or another. But the example with restrict adds a surprising twist to the whole mess.

There was also another bug that involves restrict and int-ptr casting that were reported by us: llvm/llvm-project#39193 Conclusion was that we need an intrinsic something like @​llvm.ptr.also_derived_from(base, a, b, c, ...) that can be used for safely replacing a pointer based on pointer equality. llvm/llvm-project#39193 #c8 That was on my to-do list, but I was busy with all other things including development of alive2 with Nuno (https://github.com/aliveToolkit/alive2 ), so its priority was low. Actually, what we've recently found with alive2 was that it is safe to replace a pointer with another in certain cases, e.g.

define i8 @​f(i8 nocapture %p, i8* %q) { if (%p == %q) ret %p // this can be optimized to ret %q ret %q }

(alive2 is still in progress; it does not support inttoptr and noalias yet, so wouldn't check your bug and this inttoptr bug, but I'll be happy if this tool can help determine when it's safe to replace pointer in the future).

I tried to catch LLVM in the act of making pointer comparison non-deterministic, but actually it seems to try hard to not do so.

It did in the past but that was fixed in 2014 -- see bug 13507 and bug 21327.

This is very interesting. So LLVM (at least in 2014) regards such behavior as buggy.

llvmbot commented 4 years ago

Any small example test case about how the magic pointer can break things?

There is now a bunch of examples in bug 44313 and bug 44374. Mostly they provide full simple examples for aspects already touched in this bug in some way or another. But the example with restrict adds a surprising twist to the whole mess.

I tried to catch LLVM in the act of making pointer comparison non-deterministic, but actually it seems to try hard to not do so.

It did in the past but that was fixed in 2014 -- see bug 13507 and bug 21327.

as I talked with Reid and Richard, it seems that the devirtualization has the same problem right now:

https://godbolt.org/g/LS9jc4

It seems there was an effort[1][2] to fix the problem as it applies to devirtualization independently from the general case, and the particular testcase was fixed between clang 6 and 7. I've just filed bug 44472 with new testcases.

[1] https://docs.google.com/document/d/16GVtCpzK8sIHNc2qZz6RN8amICNBtvjWUod2SujZVEo [2] https://www.youtube.com/watch?v=Dt4UehzzcsE

nunoplopes commented 6 years ago

For reference, these slides give an illustrated explanation on this bug: http://llvm.org/devmtg/2018-04/slides/Lopes-Sbirlea-Pointers,%20Alias%20and%20ModRef%20Analyses.pdf

llvmbot commented 7 years ago

It probably happens in practice (why wouldn't it?). I agree it's a potentially risky thing, but I wouldn't claim we cannot fold the comparisons to false.

Going back a step: why do we care about folding "p+n==q" to false? Does this pattern come up often enough that constant folding it is necessary?

I don't think we do. I think the question is: if I have pointers p and q, and p is based on a and q is based on b, and I have p == q, can I fold that to false? If I need to prove first that p != a + n and q != b + m, then it becomes harder.

(I feel like I may be rehashing things that have already been settled, please let me know if that's the case.)

That's what I was asking though -- are there cases where folding "p == q" to "false" (assuming we can prove based_on(p) != based_on(q)) is important?

Are you assuming you can never determine q is a constant? IE these are always variable pointers?

Or are you also considering the case where q is constant?

Because the obvious answer when q is constant nullptr is "null pointer checks".

Past that, this is generally how value profiling would work to remove aliasing, etc, so we should be careful there is some way to make that work (IE comparison of p or q with constant pointers or ranges, usually)

Past all that, when p and q is "guaranteed" to be non-constant, yeah, don't care in practice.

RalfJung commented 7 years ago

By defining the comparison of p+n==q to return a non-deterministic value, it allows the compiler to fold the comparison to false, and the hardware to return true or false.

Going back a step: why do we care about folding "p+n==q" to false? Does this pattern come up often enough that constant folding it is necessary?

I tried to catch LLVM in the act of making pointer comparison non-deterministic, but actually it seems to try hard to not do so. In particular, while it will fold "p+n == q" to false if that's the only use of p and qo, it no longer does so if p or q are escaped: https://godbolt.org/g/RD91og.

So, is this "p+n == q --> false" actually handled similar to foldAllocaCmp? (I've always been wondering why that one should only handle alloca, anyway.)

sanjoy commented 7 years ago

It probably happens in practice (why wouldn't it?). I agree it's a potentially risky thing, but I wouldn't claim we cannot fold the comparisons to false.

Going back a step: why do we care about folding "p+n==q" to false? Does this pattern come up often enough that constant folding it is necessary?

I don't think we do. I think the question is: if I have pointers p and q, and p is based on a and q is based on b, and I have p == q, can I fold that to false? If I need to prove first that p != a + n and q != b + m, then it becomes harder.

(I feel like I may be rehashing things that have already been settled, please let me know if that's the case.)

That's what I was asking though -- are there cases where folding "p == q" to "false" (assuming we can prove based_on(p) != based_on(q)) is important? Why can't we say that since we can't do provenance based comparisons at runtime, pointer comparisons are equivalent to comparing their bitwise representations at runtime? This prevents propagateEquality on pointer types though, but it looks like we're fine with that?

hfinkel commented 7 years ago

It probably happens in practice (why wouldn't it?). I agree it's a potentially risky thing, but I wouldn't claim we cannot fold the comparisons to false.

Going back a step: why do we care about folding "p+n==q" to false? Does this pattern come up often enough that constant folding it is necessary?

I don't think we do. I think the question is: if I have pointers p and q, and p is based on a and q is based on b, and I have p == q, can I fold that to false? If I need to prove first that p != a + n and q != b + m, then it becomes harder.

sanjoy commented 7 years ago

It probably happens in practice (why wouldn't it?). I agree it's a potentially risky thing, but I wouldn't claim we cannot fold the comparisons to false.

Going back a step: why do we care about folding "p+n==q" to false? Does this pattern come up often enough that constant folding it is necessary?

hfinkel commented 7 years ago

The comparision between pointers from different array is Unspecified Behavior, not UB. I spoked about it with Richard a few days ago and he said that they changed wording in a new standard, but the intention was always to have it as Unspecified Behavior. It also just makes sense, compiler should not be legal to insert unreachable when it finds out that pointers from different arrays are compared.

You're right. I am not sure that unspecified behavior is strong enough to let us fold these comparisons away.

Reading the C++17 standard again, I think it is: http://eel.is/c++draft/defns.unspecified unspecified behavior: behavior, for a well-formed program construct and correct data, that depends on the implementation.

(my translation: you can return any value; but you cannot trigger UB otherwise it wouldn't be well-formed)

http://eel.is/c++draft/expr.eq#2.1 (p == q+n) is unspecified

Since it's up to the implementation, I can define the semantics to be "return true/false arbitrarily". We cannot return undef, since it can take different values on each use, but we can return a non-deterministic value (i.e., freeze(poison) in our model). The caveat is that theoretically we need to be careful when duplicating icmps. If the C++ program has a single comparison and then we duplicate it in the IR for some reason, then both of the comparisons would have to return the same value. So if we fold one of the icmps, we would need to guarantee that we will fold all the duplicates.

This is exactly the thing that I don't know how to guarantee, at least not without making them strictly no-duplicate (which would be far worse than not folding in this case). I could have the icmp in a function that is called in a loop. We could then unroll the loop and only inline some of the call sites. Maybe for the inlined call sites we fold the comparison, and for the others we don't. There seem to be lots of inlining/unrolling situations in which this could happen.

Maybe we should do this anyway, but I think some thought will be required to define what this means semantically.

It probably happens in practice (why wouldn't it?). I agree it's a potentially risky thing, but I wouldn't claim we cannot fold the comparisons to false.

llvmbot commented 7 years ago

The comparision between pointers from different array is Unspecified Behavior, not UB. I spoked about it with Richard a few days ago and he said that they changed wording in a new standard, but the intention was always to have it as Unspecified Behavior. It also just makes sense, compiler should not be legal to insert unreachable when it finds out that pointers from different arrays are compared.

You're right. I am not sure that unspecified behavior is strong enough to let us fold these comparisons away.

Reading the C++17 standard again, I think it is: http://eel.is/c++draft/defns.unspecified unspecified behavior: behavior, for a well-formed program construct and correct data, that depends on the implementation.

(my translation: you can return any value; but you cannot trigger UB otherwise it wouldn't be well-formed)

http://eel.is/c++draft/expr.eq#2.1 (p == q+n) is unspecified

Since it's up to the implementation, I can define the semantics to be "return true/false arbitrarily". We cannot return undef, since it can take different values on each use, but we can return a non-deterministic value (i.e., freeze(poison) in our model). The caveat is that theoretically we need to be careful when duplicating icmps. If the C++ program has a single comparison and then we duplicate it in the IR for some reason, then both of the comparisons would have to return the same value. So if we fold one of the icmps, we would need to guarantee that we will fold all the duplicates. It probably happens in practice (why wouldn't it?). I agree it's a potentially risky thing, but I wouldn't claim we cannot fold the comparisons to false.

Yes sorry, you are right, I was thinking of deleting the compare instead of folding. If the optimizer figures out the value of address, then I don't see why it would not be able to fold the compare to constant.

nunoplopes commented 7 years ago

The comparision between pointers from different array is Unspecified Behavior, not UB. I spoked about it with Richard a few days ago and he said that they changed wording in a new standard, but the intention was always to have it as Unspecified Behavior. It also just makes sense, compiler should not be legal to insert unreachable when it finds out that pointers from different arrays are compared.

You're right. I am not sure that unspecified behavior is strong enough to let us fold these comparisons away.

Reading the C++17 standard again, I think it is: http://eel.is/c++draft/defns.unspecified unspecified behavior: behavior, for a well-formed program construct and correct data, that depends on the implementation.

(my translation: you can return any value; but you cannot trigger UB otherwise it wouldn't be well-formed)

http://eel.is/c++draft/expr.eq#2.1 (p == q+n) is unspecified

Since it's up to the implementation, I can define the semantics to be "return true/false arbitrarily". We cannot return undef, since it can take different values on each use, but we can return a non-deterministic value (i.e., freeze(poison) in our model). The caveat is that theoretically we need to be careful when duplicating icmps. If the C++ program has a single comparison and then we duplicate it in the IR for some reason, then both of the comparisons would have to return the same value. So if we fold one of the icmps, we would need to guarantee that we will fold all the duplicates. It probably happens in practice (why wouldn't it?). I agree it's a potentially risky thing, but I wouldn't claim we cannot fold the comparisons to false.

nunoplopes commented 7 years ago

Otherwise we would need to instrument the code to detect this case and return false (since otherwise the comparison in hardware will return true because we'll do a simple integer comparison). By defining the comparison of p+n==q to return a non-deterministic value, it allows the compiler to fold the comparison to false, and the hardware to return true or false.

Exactly. This is what I'd prefer. This may be a nice thing we can't have, however. My concern is that there's some reason we can't do that and still end up with a model that makes sense. To put it another way, if I have a comparison that gives one answer to some observers and other answers to other observers, there may be nothing that prevents that logical contradiction from infecting the remainder of the execution. Moreover, there may be no reasonably way to program the remainder of the program such that actual UB is avoided in the presence of arbitrary logical contractions.

(...) This is pretty cool info; thanks for the pointers. So far I was focusing only on C++ standard, hoping that the C standard would be equal, but apparently not. I agree with your points regarding not being possible to fold comparisons to false if you read the C++ standard to the letter. It's funny that both GCC and LLVM ignore the standard, but MSVC and ICC do not: https://godbolt.org/g/Xbybrz

IMHO, there should be a difference between comparing pointers and pointers casted to integer. And there is in fact. If we want a memory model based on objects and so on, it makes sense to yield false if we compare 2 different objects.

As a technical matter, I'm strongly opposed to any transformations introducing new ptr2int/int2ptr instructions. Unless, that is, we teach AA to deal with them and also deal with the associated integer arithmetic as pointer arithmetic.

I agree with that: we should avoid introducing ptr2int, since it's a very strong construct. We are working on a patch that includes improvements to AA to support ptr2int. We'll share the code & benchmarks results in a few weeks hopefully.

Different topic, but my immediate question is going to be: can we canonicalize away from the ptr2int/int2ptr instead of teaching AA about them?

I only have 2 alternatives at the moment: ptr2int or introduce this new union instruction. We will benchmark both and then report back. There's no way of escaping some sort of int2ptr instruction, because it exists to widen the provenance of pointers: normal pointers can only point to one object, while pointers returned by int2ptr may alias 1, 2, or more. Whatever way we "fix" this, we will always want/need to teach AA about this multiple provenance of some pointers, but that still don't alias with everybody.

llvmbot commented 7 years ago

The comparision between pointers from different array is Unspecified Behavior, not UB. I spoked about it with Richard a few days ago and he said that they changed wording in a new standard, but the intention was always to have it as Unspecified Behavior. It also just makes sense, compiler should not be legal to insert unreachable when it finds out that pointers from different arrays are compared.

You're right. I am not sure that unspecified behavior is strong enough to let us fold these comparisons away.

It does not. By this standard says "language does not give you guarantee that things will be in right order in memory so that the comparision will always produce the same value". It could not be Implementation Defined, because most of the impmentation does not give this guarantee either.

hfinkel commented 7 years ago

The comparision between pointers from different array is Unspecified Behavior, not UB. I spoked about it with Richard a few days ago and he said that they changed wording in a new standard, but the intention was always to have it as Unspecified Behavior. It also just makes sense, compiler should not be legal to insert unreachable when it finds out that pointers from different arrays are compared.

You're right. I am not sure that unspecified behavior is strong enough to let us fold these comparisons away.

hfinkel commented 7 years ago

So, in this model, does this literally mean if the equivalence is not between two pointer type arguments, i can use it?

(IE if i ignore equivalences where Op1->getType()->isPointerTy() && Op2->getType()->isPointerTy(), is that sufficient, or am i misunderstanding?)

Yes, that's correct. (well, not sure about the && part, feels like it should be ||, but probably you cannot propagate equivalences between values of different types?) That was one of the design goals: don't make the common case (propagate/fold integers) more complicated.

if p points to one pass the last byte of one object and q points to another object, the comparison is UB. When it is UB, why can't compiler do anything it pleases?

I thought http://eel.is/c++draft/expr.eq#2.1 said it was unspecified, but I may have misunderstood.

It says that. So we can do what we wish. It would be reasonable, I think, to always fold the comparisons to false.

We can try to fold it to false whenever we spot this case, but we cannot guarantee that it will always return false.

That's what I meant.

Otherwise we would need to instrument the code to detect this case and return false (since otherwise the comparison in hardware will return true because we'll do a simple integer comparison). By defining the comparison of p+n==q to return a non-deterministic value, it allows the compiler to fold the comparison to false, and the hardware to return true or false.

Exactly. This is what I'd prefer. This may be a nice thing we can't have, however. My concern is that there's some reason we can't do that and still end up with a model that makes sense. To put it another way, if I have a comparison that gives one answer to some observers and other answers to other observers, there may be nothing that prevents that logical contradiction from infecting the remainder of the execution. Moreover, there may be no reasonably way to program the remainder of the program such that actual UB is avoided in the presence of arbitrary logical contractions.

I'll also note that there is some related history here. In lib/Analysis/InstructionSimplify.cpp we have:

// A significant optimization not implemented here is assuming that alloca // addresses are not equal to incoming argument values. They don't alias, // as we say, but that doesn't mean they aren't equal, so we take a // conservative approach. // // This is inspired in part by C++11 5.10p1: // "Two pointers of the same type compare equal if and only if they are both // null, both point to the same function, or both represent the same // address." // // This is pretty permissive. // // It's also partly due to C11 6.5.9p6: // "Two pointers compare equal if and only if both are null pointers, both are // pointers to the same object (including a pointer to an object and a // subobject at its beginning) or function, both are pointers to one past the // last element of the same array object, or one is a pointer to one past the // end of one array object and the other is a pointer to the start of a // different array object that happens to immediately follow the first array // object in the address space.) // // C11's version is more restrictive, however there's no reason why an argument // couldn't be a one-past-the-end value for a stack object in the caller and be // equal to the beginning of a stack object in the callee. // // If the C and C++ standards are ever made sufficiently restrictive in this // area, it may be possible to update LLVM's semantics accordingly and reinstate // this optimization. static Constant computePointerICmp(const DataLayout &DL, const TargetLibraryInfo TLI, ... // Various optimizations for (in)equality comparisons. if (Pred == CmpInst::ICMP_EQ || Pred == CmpInst::ICMP_NE) { // Different non-empty allocations that exist at the same time have // different addresses (if the program can tell). Global variables always // exist, so they always exist during the lifetime of each other and all // allocas. Two different allocas usually have different addresses... // // However, if there's an @​llvm.stackrestore dynamically in between two // allocas, they may have the same address. It's tempting to reduce the // scope of the problem by only looking at static allocas here. That would // cover the majority of allocas while significantly reducing the likelihood // of having an @​llvm.stackrestore pop up in the middle. However, it's not // actually impossible for an @​llvm.stackrestore to pop up in the middle of // an entry block. Also, if we have a block that's not attached to a // function, we can't tell if it's "static" under the current definition. // Theoretically, this problem could be fixed by creating a new kind of // instruction kind specifically for static allocas. Such a new instruction // could be required to be at the top of the entry block, thus preventing it // from being subject to a @​llvm.stackrestore. Instcombine could even // convert regular allocas into these special allocas. It'd be nifty. // However, until then, this problem remains open. // // So, we'll assume that two non-empty allocas have different addresses // for now. // // With all that, if the offsets are within the bounds of their allocations // (and not one-past-the-end! so we can't use inbounds!), and their // allocations aren't the same, the pointers are not equal. // // Note that it's not necessary to check for LHS being a global variable // address, due to canonicalization and constant folding. ...

As a technical matter, I'm strongly opposed to any transformations introducing new ptr2int/int2ptr instructions. Unless, that is, we teach AA to deal with them and also deal with the associated integer arithmetic as pointer arithmetic.

I agree with that: we should avoid introducing ptr2int, since it's a very strong construct. We are working on a patch that includes improvements to AA to support ptr2int. We'll share the code & benchmarks results in a few weeks hopefully.

Different topic, but my immediate question is going to be: can we canonicalize away from the ptr2int/int2ptr instead of teaching AA about them?

If we want to go down this route nonetheless, we could use some new instruction that returned some kind of opaque value. The key here is to prevent the introduction of arithmetic on these values and allow us to teach AA about them.

Funny, we have been thinking about opaque values as well :) We don't have a patch ready for this yet nor a formal model of it yet. We'll share more details about it later.

llvmbot commented 7 years ago

The comparision between pointers from different array is Unspecified Behavior, not UB. I spoked about it with Richard a few days ago and he said that they changed wording in a new standard, but the intention was always to have it as Unspecified Behavior. It also just makes sense, compiler should not be legal to insert unreachable when it finds out that pointers from different arrays are compared.

llvmbot commented 7 years ago

So, in this model, does this literally mean if the equivalence is not between two pointer type arguments, i can use it?

(IE if i ignore equivalences where Op1->getType()->isPointerTy() && Op2->getType()->isPointerTy(), is that sufficient, or am i misunderstanding?)

Yes, that's correct. (well, not sure about the && part, feels like it should be ||, but probably you cannot propagate equivalences between values of different types?) That was one of the design goals: don't make the common case (propagate/fold integers) more complicated.

Thanks, that's very helpful to know where to go. As for && vs ||, icmp requires types of operands both be the same anyway. So it would actually suffice to just check one :)

nunoplopes commented 7 years ago

So, in this model, does this literally mean if the equivalence is not between two pointer type arguments, i can use it?

(IE if i ignore equivalences where Op1->getType()->isPointerTy() && Op2->getType()->isPointerTy(), is that sufficient, or am i misunderstanding?)

Yes, that's correct. (well, not sure about the && part, feels like it should be ||, but probably you cannot propagate equivalences between values of different types?) That was one of the design goals: don't make the common case (propagate/fold integers) more complicated.

if p points to one pass the last byte of one object and q points to another object, the comparison is UB. When it is UB, why can't compiler do anything it pleases?

I thought http://eel.is/c++draft/expr.eq#2.1 said it was unspecified, but I may have misunderstood.

It says that. So we can do what we wish. It would be reasonable, I think, to always fold the comparisons to false.

We can try to fold it to false whenever we spot this case, but we cannot guarantee that it will always return false. Otherwise we would need to instrument the code to detect this case and return false (since otherwise the comparison in hardware will return true because we'll do a simple integer comparison). By defining the comparison of p+n==q to return a non-deterministic value, it allows the compiler to fold the comparison to false, and the hardware to return true or false.

As a technical matter, I'm strongly opposed to any transformations introducing new ptr2int/int2ptr instructions. Unless, that is, we teach AA to deal with them and also deal with the associated integer arithmetic as pointer arithmetic.

I agree with that: we should avoid introducing ptr2int, since it's a very strong construct. We are working on a patch that includes improvements to AA to support ptr2int. We'll share the code & benchmarks results in a few weeks hopefully.

If we want to go down this route nonetheless, we could use some new instruction that returned some kind of opaque value. The key here is to prevent the introduction of arithmetic on these values and allow us to teach AA about them.

Funny, we have been thinking about opaque values as well :) We don't have a patch ready for this yet nor a formal model of it yet. We'll share more details about it later.

hfinkel commented 7 years ago

The casts can be elided if the elected pointer is provably dereferenceable for all memory accesses in "use(p)" and "use(q)". There are several reasons for this constraint: 1) either p,q might have been freed already and thus memory access through them is UB; 2) if p = a + n, then "p == q" yields a non-deterministic value (per the C++ semantics: http://eel.is/c++draft/expr.eq#2.1), so we cannot replace q with "a+n" because the non-deterministic value may evaluate to true.

if p points to one pass the last byte of one object and q points to another object, the comparison is UB. When it is UB, why can't compiler do anything it pleases?

I thought http://eel.is/c++draft/expr.eq#2.1 said it was unspecified, but I may have misunderstood.

It says that. So we can do what we wish. It would be reasonable, I think, to always fold the comparisons to false.

In any case, we could probably define it that way, but then icmp instructions would have UB in some cases and we won't be able to freely move them (trivially hoist out of loops etc.). Same situation if we have it return poison or something like poison.

This is not a binary thing. It is UB in the standard, but I suspect that we can define the behavior in this case in a somewhat more benign way.

llvmbot commented 7 years ago

So, in this model, does this literally mean if the equivalence is not between two pointer type arguments, i can use it?

(IE if i ignore equivalences where Op1->getType()->isPointerTy() && Op2->getType()->isPointerTy(), is that sufficient, or am i misunderstanding?)

hfinkel commented 7 years ago

We've been working on a memory model for LLVM and GCC, and while it's not fully finished, we already have a reasonable degree of confidence in it. The changes that are required to LLVM are quite simple:

  • Disable blind folding of int2ptr(ptr2int(x)) -> x
  • Disable some type punning through load/store (implicit ptr2int casts, essentially). We don't have a great solution for this yet; we just know that what LLVM is doing right now isn't correct. (I'll skip this issue for now)

In this model, you can keep propagating integer equalities as before. But you need to be careful when propagating pointer equalities, because just because "p == q", it doesn't mean that you can replace p with q or vice-versa. p,q have equal value if "p == q", but other attributes (like provenance, liveness) are not necessarily equal. A safe version for GVN (that we are benchmarking ATM) is to do the following: if (p == q) { use(p) use(q) } => v = ptr2int(p) w = ptr2int(q) if (v == w) { p' = int2ptr(v) use(p') use(p') }

As a technical matter, I'm strongly opposed to any transformations introducing new ptr2int/int2ptr instructions. Unless, that is, we teach AA to deal with them and also deal with the associated integer arithmetic as pointer arithmetic.

If we want to go down this route nonetheless, we could use some new instruction that returned some kind of opaque value. The key here is to prevent the introduction of arithmetic on these values and allow us to teach AA about them.

This, in some sense, may be equivalent to the union-instruction idea you mentioned, but using an implicit, instead of explicit, representation.

david-xl commented 7 years ago

I thought http://eel.is/c++draft/expr.eq#2.1 said it was unspecified, but I may have misunderstood.

In any case, we could probably define it that way, but then icmp instructions would have UB in some cases and we won't be able to freely move them (trivially hoist out of loops etc.). Same situation if we have it return poison or something like poison.

I agree, I don't think we can make comparisons of pointers to different objects UB, the standard says that the result is "unspecified". Consider that std::map<T*> needs pointers to separate objects to be comparable. It uses std::less, but in practice that lowers to LLVM pointer comparisons.

It does not say comparison of pointers to different objects is unspecified, but comparison of pointer to one complete object with another pointer representing address one past the last element of a different complete object unspecified.

rnk commented 7 years ago

I thought http://eel.is/c++draft/expr.eq#2.1 said it was unspecified, but I may have misunderstood.

In any case, we could probably define it that way, but then icmp instructions would have UB in some cases and we won't be able to freely move them (trivially hoist out of loops etc.). Same situation if we have it return poison or something like poison.

I agree, I don't think we can make comparisons of pointers to different objects UB, the standard says that the result is "unspecified". Consider that std::map<T*> needs pointers to separate objects to be comparable. It uses std::less, but in practice that lowers to LLVM pointer comparisons.

sanjoy commented 7 years ago

The casts can be elided if the elected pointer is provably dereferenceable for all memory accesses in "use(p)" and "use(q)". There are several reasons for this constraint: 1) either p,q might have been freed already and thus memory access through them is UB; 2) if p = a + n, then "p == q" yields a non-deterministic value (per the C++ semantics: http://eel.is/c++draft/expr.eq#2.1), so we cannot replace q with "a+n" because the non-deterministic value may evaluate to true.

if p points to one pass the last byte of one object and q points to another object, the comparison is UB. When it is UB, why can't compiler do anything it pleases?

I thought http://eel.is/c++draft/expr.eq#2.1 said it was unspecified, but I may have misunderstood.

In any case, we could probably define it that way, but then icmp instructions would have UB in some cases and we won't be able to freely move them (trivially hoist out of loops etc.). Same situation if we have it return poison or something like poison.

david-xl commented 7 years ago

We've been working on a memory model for LLVM and GCC, and while it's not fully finished, we already have a reasonable degree of confidence in it. The changes that are required to LLVM are quite simple:

  • Disable blind folding of int2ptr(ptr2int(x)) -> x
  • Disable some type punning through load/store (implicit ptr2int casts, essentially). We don't have a great solution for this yet; we just know that what LLVM is doing right now isn't correct. (I'll skip this issue for now)

In this model, you can keep propagating integer equalities as before. But you need to be careful when propagating pointer equalities, because just because "p == q", it doesn't mean that you can replace p with q or vice-versa. p,q have equal value if "p == q", but other attributes (like provenance, liveness) are not necessarily equal. A safe version for GVN (that we are benchmarking ATM) is to do the following: if (p == q) { use(p) use(q) } => v = ptr2int(p) w = ptr2int(q) if (v == w) { p' = int2ptr(v) use(p') use(p') }

The casts can be elided if the elected pointer is provably dereferenceable for all memory accesses in "use(p)" and "use(q)". There are several reasons for this constraint: 1) either p,q might have been freed already and thus memory access through them is UB; 2) if p = a + n, then "p == q" yields a non-deterministic value (per the C++ semantics: http://eel.is/c++draft/expr.eq#2.1), so we cannot replace q with "a+n" because the non-deterministic value may evaluate to true.

if p points to one pass the last byte of one object and q points to another object, the comparison is UB. When it is UB, why can't compiler do anything it pleases?

Another option for GVN is to introduce a new "union p, q" instruction and replace use(p) with use(union p, q). This new instruction would do the meet operation between p, q (think of the pointer lattice with provenances). The solution above with int2ptr/ptr2int round-trip essentially returns Top from this lattice (i.e., the returned pointer can alias with all the pointers that have been ptr2int'ed so far that appear in the control/data-flow reaching p,q). So this union instruction yields a more precise result, but I dunno how important that is in practice (under study).

I think you hit the essence of the issue -- it is not GVN, but lack of IR support to update static properties/attributes.

Your first suggestion to use int2ptr/ptr2int round trip is like 'giving up' those properties completely. The 'union' instruction idea is more precise but too heavy weight. There might be other ways to merge value properties ..

In summary, if we have: p = (int)malloc(8) v = ptr2int(p) + 4; q = int2ptr(v) q = 3;

we can optimize into: p[1] = 3

This is sound because we can see through the casts that q must point to the object starting at p at that object is still alive. If we don't know anything about p, then the optimization can't be done.

(GCC is more conservative than LLVM right now with respect ptr2int, but it also needs similar tweaks to become correct)

nunoplopes commented 7 years ago

We've been working on a memory model for LLVM and GCC, and while it's not fully finished, we already have a reasonable degree of confidence in it. The changes that are required to LLVM are quite simple:

In this model, you can keep propagating integer equalities as before. But you need to be careful when propagating pointer equalities, because just because "p == q", it doesn't mean that you can replace p with q or vice-versa. p,q have equal value if "p == q", but other attributes (like provenance, liveness) are not necessarily equal. A safe version for GVN (that we are benchmarking ATM) is to do the following: if (p == q) { use(p) use(q) } => v = ptr2int(p) w = ptr2int(q) if (v == w) { p' = int2ptr(v) use(p') use(p') }

The casts can be elided if the elected pointer is provably dereferenceable for all memory accesses in "use(p)" and "use(q)". There are several reasons for this constraint: 1) either p,q might have been freed already and thus memory access through them is UB; 2) if p = a + n, then "p == q" yields a non-deterministic value (per the C++ semantics: http://eel.is/c++draft/expr.eq#2.1), so we cannot replace q with "a+n" because the non-deterministic value may evaluate to true.

Another option for GVN is to introduce a new "union p, q" instruction and replace use(p) with use(union p, q). This new instruction would do the meet operation between p, q (think of the pointer lattice with provenances). The solution above with int2ptr/ptr2int round-trip essentially returns Top from this lattice (i.e., the returned pointer can alias with all the pointers that have been ptr2int'ed so far that appear in the control/data-flow reaching p,q). So this union instruction yields a more precise result, but I dunno how important that is in practice (under study).

In summary, if we have: p = (int)malloc(8) v = ptr2int(p) + 4; q = int2ptr(v) q = 3;

we can optimize into: p[1] = 3

This is sound because we can see through the casts that q must point to the object starting at p at that object is still alive. If we don't know anything about p, then the optimization can't be done.

(GCC is more conservative than LLVM right now with respect ptr2int, but it also needs similar tweaks to become correct)

llvmbot commented 7 years ago

Okay, so i kind of need to try to push this forward a bit and see if there is any consensus.

One of the things on my plate is to match the equivalence handling of gvn in newgvn, in a bunch of cases. A bunch depend on equality/inequality of variables.

If we are going to rip out all the equivalence handling in the compiler to make these cases work (last i looked, it was in lvi, simplifyinstruction, gvn, and newgvn, but i didn't look harder), i don't want to spend time to do that work :)

(Otherwise, i want to know where we want to go, in particular:

  1. Do nothing at all. Make sad face.
  2. Do nothing with equivalences, work on memory model
  3. Rip out equivalences from comparisons between variables and ban them
  4. Rip out all equivalence handling.

Note that #​3 working in all cases depends on the compiler never discovering the integer value of a pointer and propagating it. It also relies on never detecting equivalences due to, for example, equal singleton ranges.

IE If you derive an equivalence from the following that the variables are equal, things break just the same:

if (pi > 48 && pi < 50) if (yi > 48 && yi < 50)

(pi == yi inside here)

same with the anti-range version: if (pi < 48 || pi > 50) if (yi < 48 || yi > 50)

(pi != yi inside here)

I would not put money on us being able to prevent that from happening.

llvmbot commented 7 years ago

The issue with invariant.group is that by replacing one pointer with another we skip the invariant.group and reference the dead pointer.

The problem is not that the compiler replaces the reference to the dead pointer. The real problem is that the original annotation based on the replaced pointer is no longer valid. In this case, it looks like the invariant property gets extended across the barrier because the invariant group annotation and a new base pointer.

By fix I meant to not perform replacement based on the comparision if the operands are not constant.

Blocking compiler from doing optimization based on value equivalence is not acceptable, as pointed out by Danny.

dropping invariant.groups would be also valid after such optimization, but the problem here is that you can't really do that across different calls

if (a == b) call(b); // this might have invariant.group loads inside.

This does matter.

Meant to say 'does not matter'

The original loads in 'call' are through the function parameter which is a different IR entity. Do you see how it can gets miscompile without dropping the annotation?

This is not Alias Analysis issue, aliasing does not do anything here - we do not replace pointer with another based on aliasing info.

It is directly related to AA. Without the wrong invariance, AA can tell that the call to B's ctor will alias with the load.

llvmbot commented 7 years ago

The issue with invariant.group is that by replacing one pointer with another we skip the invariant.group and reference the dead pointer.

The problem is not that the compiler replaces the reference to the dead pointer. The real problem is that the original annotation based on the replaced pointer is no longer valid. In this case, it looks like the invariant property gets extended across the barrier because the invariant group annotation and a new base pointer.

By fix I meant to not perform replacement based on the comparision if the operands are not constant.

Blocking compiler from doing optimization based on value equivalence is not acceptable, as pointed out by Danny.

dropping invariant.groups would be also valid after such optimization, but the problem here is that you can't really do that across different calls

if (a == b) call(b); // this might have invariant.group loads inside.

This does matter. The original loads in 'call' are through the function parameter which is a different IR entity. Do you see how it can gets miscompile without dropping the annotation?

This is not Alias Analysis issue, aliasing does not do anything here - we do not replace pointer with another based on aliasing info.

It is directly related to AA. Without the wrong invariance, AA can tell that the call to B's ctor will alias with the load.

llvmbot commented 7 years ago

The issue with invariant.group is that by replacing one pointer with another we skip the invariant.group and reference the dead pointer. By fix I meant to not perform replacement based on the comparision if the operands are not constant.

This will not actually fix anything, particularly when you mix it with inttoptr, because that constant may in fact be a constant that is really a pointer :) (if you don't believe it's possible to make that bug still happen, really, just look at the original testcase and be createive).

This is pretty well established by GCC bugs (and our bugs) at this point.

The only thing you could do to avoid it in all cases would be to drop any equivalences gleaned from comparisons in all passes and simplifications (GCC came to the same conclusion).

IMHo, this is a super-bad idea, as they are responsible for a lot of performance.

As mentioned previously, if we view the "replace pointer with pointer" and "replace thing that was/becomes a pointer through inttoptr" as two different issues - the first is pretty easy to solve by either

  1. passing along the implicit info so we accurately reflect the aliasing/lifetime of the pointed-to object. or
  2. require memoryssa everywhere and only allow pointer variables to be propagated when memory state they depend on is the same or
  3. introduce a pointercmp instruction.

I'd prefer all of these over trying to special case and keep up to date every place in the compiler that gleans info on icmps such that they all special case pointer icmps.

But i'm also not going to fix any of these "bugs", so i'll defer to whoever tries to fix it, as long as they do it in a way that doesn't cause massive performance regression relative to GCC (which, again, happily propagates equivalences).

The latter intotptr problem (and the original testcase on this bug) is impossible to solve "in full generality" except by dropping literally all non-floating equivalences gleaned from comparisons. I can split up any pointer into as many pieces, equivalence them in ways we break right now, and then reassemble the pointer.

Given how uncommon it is, i'm still of the belief that a well lit path for people to follow who want to do it.

At least in the initial performance testing i did, dropping equivalences except for those involving constants was a significant loss. So i'm pretty sure "meet halfway" wouldn't work well here.

(But i welcome other testing).

llvmbot commented 7 years ago

The issue with invariant.group is that by replacing one pointer with another we skip the invariant.group and reference the dead pointer. By fix I meant to not perform replacement based on the comparision if the operands are not constant.

dropping invariant.groups would be also valid after such optimization, but the problem here is that you can't really do that across different calls

if (a == b) call(b); // this might have invariant.group loads inside.

This is not Alias Analysis issue, aliasing does not do anything here - we do not replace pointer with another based on aliasing info.

llvmbot commented 7 years ago

A bunch of things worth mentioning:

As Piotr says, you can get GVN to choose the other pointer just by switching the preference it has: diff --git a/lib/Transforms/Scalar/GVN.cpp b/lib/Transforms/Scalar/GVN.cpp index 593aad74bd1..8bd0ba2756a 100644 --- a/lib/Transforms/Scalar/GVN.cpp +++ b/lib/Transforms/Scalar/GVN.cpp @@ -1703,7 +1703,7 @@ bool GVN::propagateEquality(Value LHS, Value RHS, const BasicBlockEdge &Root, // Move the 'oldest' value to the right-hand side, using the value number // as a proxy for age. uint32_t RVN = VN.lookupOrAdd(RHS);

  • if (LVN < RVN) {
  • if (LVN >= RVN) { std::swap(LHS, RHS); LVN = RVN; }

As mentioned, piotr's example happens to involve a barrier bug. Certainly you can see that you can generate a near infinite set of examples where it will propagate past implicit lifetimes that affect aliasing. (i believe on another bug, sanjoy has demonstrated large numbers of issues using free)

You mean the example in comment #​40?

A = malloc(); free(A); B = malloc(); B = 0 C = select (ptrtoint(A) == ptrtoint(B)), B, A ;; S0 r = C free(B);

That program is ill formed -- There is potential use after free error (for referencing A at S0 and beyond.

Making compiler to well-behave in this case is possible -- e.g, by make AA more flow sensitive (to honor the implicit barrier of 'free'), but why would we do that for ill-formed program (even though there is a runtime path that is not ill-formed)?

It's also worth mentioning: The invariant barrier intrinsics are only defined to affect the invariant group info, and stop a pointer from carrying it. You should be able to drop all invariant group info and barriers and get the same issue.

Here, that issue it is directly expressed in the IR, as if you remove the invariant barrier and propagate its argument, it already reuses the same pointer past the end of its lifetime, and nothing stops it.

llvmbot commented 7 years ago

A bunch of things worth mentioning:

As Piotr says, you can get GVN to choose the other pointer just by switching the preference it has: diff --git a/lib/Transforms/Scalar/GVN.cpp b/lib/Transforms/Scalar/GVN.cpp index 593aad74bd1..8bd0ba2756a 100644 --- a/lib/Transforms/Scalar/GVN.cpp +++ b/lib/Transforms/Scalar/GVN.cpp @@ -1703,7 +1703,7 @@ bool GVN::propagateEquality(Value LHS, Value RHS, const BasicBlockEdge &Root, // Move the 'oldest' value to the right-hand side, using the value number // as a proxy for age. uint32_t RVN = VN.lookupOrAdd(RHS);

As mentioned, piotr's example happens to involve a barrier bug. Certainly you can see that you can generate a near infinite set of examples where it will propagate past implicit lifetimes that affect aliasing. (i believe on another bug, sanjoy has demonstrated large numbers of issues using free)

It's also worth mentioning: The invariant barrier intrinsics are only defined to affect the invariant group info, and stop a pointer from carrying it. You should be able to drop all invariant group info and barriers and get the same issue.

Here, that issue it is directly expressed in the IR, as if you remove the invariant barrier and propagate its argument, it already reuses the same pointer past the end of its lifetime, and nothing stops it.