rust-lang / rust

Does LLVM's quirk of counting in bits matter to us? #130683

Open workingjubilee opened 2 months ago

workingjubilee commented 2 months ago

Location

In many places we document an isize::MAX limit on object sizes; for example: https://doc.rust-lang.org/std/slice/fn.from_raw_parts.html
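
For concreteness, here is a minimal sketch of that documented requirement: the total size of the slice in bytes (len * size_of::<T>()) must not exceed isize::MAX. The slice_checked helper is hypothetical and purely illustrative.

```rust
use std::mem::size_of;
use std::slice;

/// Hypothetical defensive wrapper: refuse to build a slice whose total
/// size in bytes would exceed isize::MAX, per the documented limit on
/// slice::from_raw_parts.
unsafe fn slice_checked<'a, T>(data: *const T, len: usize) -> Option<&'a [T]> {
    let bytes = len.checked_mul(size_of::<T>())?;
    if bytes > isize::MAX as usize {
        return None;
    }
    // Safety: the caller must still uphold the remaining from_raw_parts
    // requirements (validity, alignment, lifetime); this only checks the size bound.
    Some(slice::from_raw_parts(data, len))
}
```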

Summary

So, hypothetically, we can "only" have an object whose size LLVM can count within 64 bits: https://github.com/rust-lang/rust/blob/55043f067dcf7067e7c6ebccf3639af94ff57bda/compiler/rustc_abi/src/lib.rs#L340-L350

However, hypothetically, we "should" be able to use an object that is isize::MAX bytes. These two facts contradict each other: LLVM's bit-counted bound caps an object at 2^61 bytes, while isize::MAX permits objects larger than that. It is hard to validate whether or not LLVM miscompiles such objects, though, as most computers do not deal in more than roughly 56 bits of address space.
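
For concreteness, a small sketch of the arithmetic behind that apparent contradiction (assuming a 64-bit target, where isize::MAX is 2^63 - 1):

```rust
// LLVM counts type sizes in bits using a 64-bit quantity, so the largest
// size it can represent is 2^64 bits = 2^61 bytes.
const LLVM_BIT_COUNTED_LIMIT_BYTES: u128 = (1u128 << 64) / 8;
// Rust's documented object-size limit is isize::MAX bytes (2^63 - 1 on 64-bit).
const ISIZE_MAX_BYTES: u128 = i64::MAX as u128;

fn main() {
    // isize::MAX is roughly four times larger than what LLVM's bit count covers.
    assert!(ISIZE_MAX_BYTES > LLVM_BIT_COUNTED_LIMIT_BYTES);
    println!("{LLVM_BIT_COUNTED_LIMIT_BYTES} bytes vs {ISIZE_MAX_BYTES} bytes");
}
```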

Anyone working in HPC who has run into this is encouraged to let us know.

Honestly, it's not even clear what we mean by "object" here. Allocation, probably?

hanna-kruppe commented 2 months ago

As far as I know, it's only llvm::TypeSize and related code that counts in bits. That's always about static types described in the IR, which matter mostly for the sizes of global variables, allocas, etc. but not for runtime behavior. If you exceed this limit then LLVM will have integer overflows at compile time, which is why obj_size_bound is always compared against Rust type layouts. The isize::MAX limit is about the address arithmetic done by getelementptr (GEP) and the newer ptradd, which always calculate in units of bytes. So that's the relevant limit for when you allocate a big slice and index it.

Since GEP interacts with type size in that it scales the index by the size of a type, you could cause a GEP "miscompilation" by having a static type larger than 2^64 bits in there. But the type size (a compile-time constant, modulo scalable vectors) gets converted to bytes before runtime, and for the runtime semantics GEP / ptradd is defined only in terms of bytes, so any analysis / optimization / codegen decision is only allowed to rely on the byte-based calculations not wrapping around.

workingjubilee commented 2 months ago

@hanna-kruppe Thank you for the explanation! So if I'm interpreting that correctly, we can't realistically have a struct whose size takes 63 bits to encode... not without doing enough work on our codegen that LLVM never actually sees the struct definition... but we can have a slice of an absurdly large element type that, per element, reaches but does not exceed the bit-counted bound? A slice of at least two such elements, anyway, which in total exceeds the bit-counted bound. Good to know!

hanna-kruppe commented 2 months ago

Strictly speaking, it's not just structs. Something like static X: [u8; 1 << 62] runs into the same problems if you want to communicate its size to LLVM, as does an equivalent const if it makes it into the final IR. So your statement about slices is correct if it truly only means [T], but extending it to arrays would be incorrect. You also can't have an alloca that big, and unlike globals you can't work around that by telling LLVM about an externally defined symbol of indeterminate size.
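
To make that concrete, here is the static from the paragraph above spelled out; on a 64-bit target this is expected to be rejected at compile time once the layout exceeds the bit-counted bound (the exact diagnostic may vary, so treat this as a sketch rather than a test case):

```rust
// 1 << 62 bytes = 2^62 bytes, which is larger than the 2^61-byte bound
// implied by LLVM counting type sizes in (64-bit) bits.
static X: [u8; 1 << 62] = [0; 1 << 62];

fn main() {
    println!("{}", X[0]);
}
```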

The issue with GEP index scaling being based on a type would likely be resolved by the migration to ptradd. The details of that were not completely finalized last I checked, but it would replace getelementptr [HUGE x i8], ptr %p, ... (where calculating the size of the array can overflow TypeSize before it's converted to bytes) with 100% byte-based arithmetic. Of course, you can already work around this today by manually doing the index scaling and feeding it into getelementptr i8, ptr %p, ....
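
For illustration, a minimal Rust-level sketch of the shape of that last workaround: compute the byte offset yourself and do byte-based pointer arithmetic, which is what getelementptr i8 / ptradd express at the IR level. The index_by_bytes helper is hypothetical, not anything rustc emits today.

```rust
use std::mem::size_of;

/// Hypothetical sketch of byte-based indexing: scale the index to bytes
/// manually and advance a byte pointer, rather than letting the pointer
/// arithmetic scale by a (possibly huge) element type.
///
/// Safety: the caller must ensure `index` is in bounds of the allocation
/// and that the resulting offset obeys the usual pointer-arithmetic rules.
unsafe fn index_by_bytes<T>(base: *const T, index: usize) -> *const T {
    let byte_offset = index * size_of::<T>(); // manual index scaling
    base.cast::<u8>().add(byte_offset).cast::<T>()
}
```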