One more reason to believe that `GEP` and `OpAccessChain` should match
I remember that I considered using `GEP` for access chains but decided against it because of some differences. I need to check on Monday what the reason was.
When lowering `GEPExpr` to integer arithmetic, we first sign-extend the indices to `i64` (the pointer size), then scale them by some factor (if indexing into an array), and then add the offsets to the base address/pointer. In SPIR-V there is an `OpAccessChain` instruction that works similarly to `GEP`. The SPIR-V parser tries to resolve it directly, but with semantics different from `GEP`: it zero-extends rather than sign-extends. So we already have a discrepancy in address computations.

This raises the following questions:
(1) Zero-extend or sign-extend? Also, do we always extend (or even truncate) to the pointer size?
(2) When does the extension happen: before or after scaling?
(3) Are `OpAccessChain` and `GEP` even the same?
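To make the discrepancy described above concrete, here is a minimal standalone sketch (plain Java for illustration, not code from our parsers; the method names are made up) of what the two extension choices do to the same `i32` index:

```java
public class ExtensionMismatch {
    // Sign-extend an i32 index to i64, as our GEPExpr lowering does.
    static long sext32(int index) {
        return (long) index; // Java's widening conversion sign-extends
    }

    // Zero-extend an i32 index to i64, as the SPIR-V parser currently does.
    static long zext32(int index) {
        return Integer.toUnsignedLong(index);
    }

    public static void main(String[] args) {
        int index = -1;  // e.g. array[x - 1] with x == 0
        long scale = 4;  // sizeof(uint)

        System.out.println(scale * sext32(index)); // -4
        System.out.println(scale * zext32(index)); // 17179869180
        // Same index, two different byte offsets -> two different addresses.
    }
}
```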
Unfortunately, (3) is hard to answer because `OpAccessChain` has barely any documentation, and nothing about how it is actually lowered to integer arithmetic. For `GEP`, we have the LangRef, which states:

The first sentence answers the second part of (1): the index type, which is the type of the first index of the `GEP`, dictates the type of the address computation. The second and third sentences tell us to do sign-extension rather than zero-extension, and also say that scaling happens after extension. The second paragraph explains some strange semantics when combining large pointers with small addresses: the upper bits of the pointer remain unaffected. This is likely related to composite pointers where the lower bits give the actual address and the upper bits carry some meta information (e.g., tagging, provenance, address space) that shall never be affected. I think it is safe to ignore this for now.
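Based on this reading, the offset computation would look roughly like the following sketch (illustrative Java only, not Dat3M code; `signed` reinterprets the low bits of a value as a signed integer of the given width, and `computationWidth` stands for the width dictated by the first index):

```java
public class GepOffsetSketch {
    // Interpret the low 'width' bits of 'value' as a signed integer (1 <= width <= 64).
    static long signed(long value, int width) {
        return (value << (64 - width)) >> (64 - width);
    }

    // Sketch of the reading above: every index is sign-extended (or truncated)
    // to the computation width BEFORE it is scaled, and the whole offset
    // computation wraps in that width.
    static long gepOffset(long[] indices, int[] indexWidths, long[] scales, int computationWidth) {
        long offset = 0;
        for (int i = 0; i < indices.length; i++) {
            long extended = signed(signed(indices[i], indexWidths[i]), computationWidth);
            offset = signed(offset + scales[i] * extended, computationWidth);
        }
        return offset;
    }

    public static void main(String[] args) {
        // A single i32 index of -1 into an array of 4-byte elements, computed in a
        // 64-bit width: extension happens first, so the scaled offset is -4
        // rather than 4 * 0xFFFFFFFF.
        System.out.println(gepOffset(new long[]{0xFFFFFFFFL}, new int[]{32}, new long[]{4}, 64)); // -4
    }
}
```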
Are we doing it right? No. We always scale GEPs to `i64` rather than to the first index type. So far, this has not raised issues. However, @hernanponcedeleon showed me an example of SPIR-V code that does indeed have issues:

The original code looked like `array[x - 1]`, i.e., accessing an array of `uint`s. The `-1` results in offset `-4` (`sizeof(uint) == 4`), which expressed as a `uint` is `4 * 1073741823`, i.e., the code effectively relies on overflowing behaviour. The parser, however, generates `4 * zext 1073741823 to i64`, which is no longer equivalent to `-4`, resulting in an out-of-bounds access.
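In numbers (a standalone snippet only to reproduce the arithmetic; the constant `1073741823` is the one from the generated expression above):

```java
public class OffsetArithmetic {
    public static void main(String[] args) {
        int index = 1073741823;  // 4 * 1073741823 == 0xFFFFFFFC, i.e. -4 as an i32

        // Offset computed in i32, as the original code expects: wraps to -4.
        int offsetI32 = 4 * index;

        // Offset the parser currently builds: zero-extend first, then scale in i64.
        long offsetI64 = 4L * Integer.toUnsignedLong(index);

        System.out.println(offsetI32); // -4
        System.out.println(offsetI64); // 4294967292 -> lands far outside the array
    }
}
```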
My first intuition was that `zext (4 * 1073741823) to i64` is the correct code, but that disagrees with LLVM's `GEP`, which scales AFTER extension. The more correct argument is that we should convert the indices to the first index type (here given by `%uint_0`, i.e., `i32`), perform the address offset computation (both scaling and adding offsets) in that type, and only at the very end extend the calculated offset to `i64` to add it to the base. This interpretation gives the correct result while retaining the idea that `OpAccessChain == GEP` holds.
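A sketch of that interpretation (plain Java again, simplified to a single array index with a constant element size; I read "extend" here as sign-extension, which is what makes the example come out right and matches `GEP`'s treatment of offsets as signed):

```java
public class ProposedLowering {
    // Proposed computation: do all scaling and adding in the first index type
    // (i32 in the example), and only extend the finished offset to i64 when
    // adding it to the base pointer.
    static long accessChainAddress(long base, int elementSize, int index) {
        int offset = elementSize * index; // wraps in i32, as the original code expects
        return base + (long) offset;      // sign-extend the finished offset to i64
    }

    public static void main(String[] args) {
        long base = 0x1000;
        // array[x - 1] with x == 0: index -1 (== 4294967295 as a uint) into 4-byte uints
        System.out.println(Long.toHexString(accessChainAddress(base, 4, -1))); // ffc, i.e. base - 4
    }
}
```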
TLDR: Our lowerings of `GEP` and `OpAccessChain` are wrong. For CPU code we have not run into issues, but GPU code exposes them. Correct computation: perform scaling and offset addition in the first index type and extend the final offset to the pointer size only when adding it to the base. We should apply this to both `GEP` and `OpAccessChain`, assuming their semantics are identical (or find a source that tells otherwise).