llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.53k stars 11.79k forks source link

Redundant pointer comparisons after arithmetic not deleted #78784

Open davidben opened 9 months ago

davidben commented 9 months ago

I noticed this as I was exploring the compiler output for some of our (Chromium's) checked iterators. To be honest, I'm not sure this check makes sense, since we should just assume the container is self-consistent. But, nonetheless, I think Clang could have deleted this check, so I'm filing this as a missed optimization.

Reproducer: https://godbolt.org/z/EzoWqGEEK

This code ends up performing a few variations on checking that data_ <= data_ + size_, where size_ is a size_t. Of course, the real reason this is always valid is that [data_, data_ + size_) describes a span, but the compiler doesn't a priori know that. However, I believe it can infer this because:

  1. size_ is unsigned, so we're not adding a negative number to data_
  2. Pointer arithmetic is required to stay in bounds, and is not defined to cast to ptrdiff_t first: https://eel.is/c++draft/expr.add#4.2
  3. Therefore, the addition cannot have overflowed and data_ <= data_ + size_ is always true in all defined cases
davidben commented 9 months ago

It's a little interesting that compiler output is checking the sign of size_, and not whether data_ + size_ overflowed. I'm guessing the issue is not overflow, per se, but that data_ + size_ gets lowered to data_ + (ptrdiff_t)size_, and we're checking if (ptrdiff_t)size_ is negative?

If I'm reading the spec right, I think pointer arithmetic is in terms of the numerical value of the offset, without coercing into any particular common type, which means we can safely assume that (ptrdiff_t)size_ is positive here. But it is a little more subtle.

zmodem commented 8 months ago

Therefore, the addition cannot have overflowed and data <= data + size_ is always true in all defined cases

IIUC, this boils down to: (https://godbolt.org/z/TvoKvoGsd)

bool f(int *p, size_t s) {
  return p <= p + s;
}

I'm guessing the issue is not overflow, per se, but that data + size gets lowered to data_ + (ptrdifft)size, and we're checking if (ptrdifft)size is negative?

Yes, I think it's something like this.

On the LLVM level, the pointer arithmetic gets lowered to a getelementptr inbounds instruction. That treats the offset as a signed value, although in this case it isn't (I think?).

If s were a uint32_t instead, LLVM would zero-extend that to 64 bits, so the optimizer sees that it can't be negative. But in the original code, it seems that information is lost.

Probably an optimization expert could explain better what's going on. Maybe @aeubanks can take a look?

aeubanks commented 8 months ago

Yeah that's pretty much it. LLVM doesn't treat integer values as unsigned/signed, but rather operations that work with ints as allowing/prohibiting unsigned/signed overflow. But since getelementptr currently can't represent any of that, we lose information lowering to IR.

bool f(int *p, unsigned long s) {
  // __builtin_assume((long)s >= 0);
  return p <= p + s;
}

Without the __builtin_assume() we're missing the optimization. With the __builtin_assume() we get the optimization we expect, but the IR contains calls to llvm.assume() which aren't dropped until very late and can block optimizations.

If we think this is important enough, then perhaps we should extend LLVM IR to be able to represent this in getelementptr without having to resort to a big hammer like llvm.assume(). Perhaps similar to https://discourse.llvm.org/t/rfc-add-zext-nneg-flag/73914, add a nneg flag to getelementptr, to indicate that a getelementptr only works with non-negative indices?

@fhahn @nikic

nikic commented 8 months ago

Having nusw and nneg flags on getelementptr are on my long-term wishlist :)

davidben commented 8 months ago

If we think this is important enough

I definitely don't know enough about this to say anything concrete here. It seems to be plausibly relevant for the other missed optimizations I found after this one. (I've been spending a lot of time exploring Chromium's and libc++'s bounded iterators. If we could get to the point that iterators are broadly bounds-checked, but that all redundant checks are reliably deleted, that would be a great safety win.)

fhahn commented 8 months ago

For context, this issue was also discussed here: https://discourse.llvm.org/t/retain-fact-that-index-offsets-for-gep-are-non-negative-for-size-t-uint64-t-offsets/65974

Might be time to revive this now