What are the special magic rules around `malloc`?

rust-lang / unsafe-code-guidelines

Forum for discussion about what unsafe code can and can't do

https://rust-lang.github.io/unsafe-code-guidelines

What are the special magic rules around `malloc`? #535

Open RalfJung opened 1 month ago

RalfJung commented 1 month ago

Taken from https://github.com/rust-lang/unsafe-code-guidelines/issues/534:

// use a mutable reference to prevent the MIR opt from happening
#[no_mangle]
pub fn src(x: &mut &u8) -> impl Sized {
    let y = **x;
    let mut z = Box::new(0);
    // a bunch of code that operates on the `Box`, however, 
    // nothing else can potentially access the underlying `u8`
    // that's behind the double reference besides the `__rust_alloc` call.

    // optimizable to `true`?
    **x == y
}

Currently, LLVM doesn't do the second optimization. However, it does perform it if you manually set System to be the global allocator: https://rust.godbolt.org/z/a77PWjeKE [^1]. This is due to this line, which is used by their GVN pass.
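To make the question concrete, here is a rough sketch of the rewrite being asked about. The name `tgt` is hypothetical, and `std::hint::black_box` is only a stand-in for the "bunch of code that operates on the `Box`"; neither is part of the original example. The return type is spelled `bool` here (instead of `impl Sized`) so the two versions can be compared directly.

```rust
use std::hint::black_box;

/// The original shape: read through the double reference, allocate,
/// then read again and compare.
pub fn src(x: &mut &u8) -> bool {
    let y = **x;
    // Stand-in for code that only touches the freshly allocated `Box`.
    let z = Box::new(0u8);
    black_box(&z);
    // Nothing above can write to the `u8` behind `x`, unless the
    // allocator call itself does.
    **x == y
}

/// Hypothetical result of the optimization: the second load and the
/// comparison are folded away.
pub fn tgt(x: &mut &u8) -> bool {
    let _y = **x;
    let z = Box::new(0u8);
    black_box(&z);
    true
}
```

Both functions return `true` whenever the allocator leaves the referenced `u8` alone; the open question is only whether the compiler is allowed to assume that across the allocation call.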

There are clearly special magic rules applying specifically for malloc that mean that its memory must be truly fresh for the Abstract Machine, and cannot be part of any previously existing stack/heap/other allocation. This is "fine" as long as malloc is called via FFI and all the state it works in is completely hidden from the current compilation unit. It becomes rather incoherent if there is ever a chance of malloc itself being inlined into surrounding code, or exchanging data with surrounding code via global state -- so we better have rules in place against things like that. I think we should say that malloc is reserved to be provided by the underlying runtime system, and it must be called via FFI in a way that no inlining is possible.
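As a minimal sketch of what "called via FFI" could look like from the Rust side: the snippet below declares `malloc` and `free` as external symbols and never defines them in the crate. It assumes the platform's C runtime provides both symbols and that `size_t` has the same width as `usize`; the helper name `roundtrip_via_malloc` is made up for illustration. Note that an `extern` declaration alone does not rule out inlining under cross-language LTO, so this shows the calling pattern, not a guarantee.

```rust
use std::ffi::c_void;

// Declarations only: the definitions are assumed to come from the
// platform's C runtime, outside this compilation unit.
unsafe extern "C" {
    fn malloc(size: usize) -> *mut c_void;
    fn free(ptr: *mut c_void);
}

/// Hypothetical helper: store a value in malloc'd memory, read it
/// back, and release the memory again.
pub fn roundtrip_via_malloc(value: u64) -> Option<u64> {
    unsafe {
        let p = malloc(std::mem::size_of::<u64>()) as *mut u64;
        if p.is_null() {
            return None; // allocation failed
        }
        // malloc returns memory suitably aligned for a u64.
        p.write(value);
        let out = p.read();
        free(p as *mut c_void);
        Some(out)
    }
}
```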

Note that this is separate from Rust's #[global_allocator] attribute, which does not get all the same magic that malloc gets. See https://github.com/rust-lang/unsafe-code-guidelines/issues/442 for discussion of the semantics of that attribute.

[^1]: You also get the malloc -> calloc transformation for types other than these hardcoded ones if you set System to be the global allocator manually.
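For reference, "setting System to be the global allocator manually" looks roughly like the following. The function `boxed_sum` is a hypothetical example of code whose `Box` allocations then go through `System`; it is not taken from the linked godbolt.

```rust
use std::alloc::System;

// Route all heap allocations of this program through the platform's
// system allocator.
#[global_allocator]
static GLOBAL: System = System;

/// Hypothetical function that performs two heap allocations.
pub fn boxed_sum(a: u32, b: u32) -> u32 {
    let x = Box::new(a);
    let y = Box::new(b);
    *x + *y
}
```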

VorpalBlade commented 1 month ago

The issue with this magic that I see is if you implement malloc itself in Rust.

If it is in a completely different cdylib/staticlib, that is probably still fine(?)
I'm not sure what happens if you implement a libc that both provides malloc and uses the same malloc itself. This is actually required: some functions in libc, such as strdup (and many more), are documented to return allocations from malloc that should be freed with free.
If it is part of the same compilation graph (as is usually the case for embedded, for example), you might run into issues(?).

Another issue is LTO or even cross-language LTO.
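To illustrate the kind of code this is about, here is a minimal sketch of a malloc-like function written in Rust as a bump allocator over a static buffer. Every name in it (`sketch_malloc`, `HEAP`, `NEXT`, `HEAP_SIZE`) is hypothetical, and it is deliberately not exported under the symbol name `malloc`. The point is that the memory it hands out is carved from an existing static, and its bookkeeping lives in global state in the same compilation graph as the callers.

```rust
use std::cell::UnsafeCell;
use std::ptr::null_mut;
use std::sync::atomic::{AtomicUsize, Ordering};

const HEAP_SIZE: usize = 4096;
const ALIGN: usize = 16;

#[repr(align(16))]
struct Heap(UnsafeCell<[u8; HEAP_SIZE]>);

// Assumption of this sketch: only disjoint regions are ever handed
// out and nothing is reused, so sharing the buffer is sound.
unsafe impl Sync for Heap {}

static HEAP: Heap = Heap(UnsafeCell::new([0; HEAP_SIZE]));
static NEXT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a Rust-implemented `malloc`.
pub extern "C" fn sketch_malloc(size: usize) -> *mut u8 {
    // Round the request up to a multiple of ALIGN (at least ALIGN).
    let rounded = match size.max(1).checked_add(ALIGN - 1) {
        Some(s) => s & !(ALIGN - 1),
        None => return null_mut(),
    };
    let mut start = NEXT.load(Ordering::Relaxed);
    loop {
        let end = match start.checked_add(rounded) {
            Some(e) if e <= HEAP_SIZE => e,
            _ => return null_mut(), // out of space
        };
        match NEXT.compare_exchange_weak(start, end, Ordering::Relaxed, Ordering::Relaxed) {
            // The returned pointer points into the existing static HEAP.
            Ok(_) => return unsafe { (HEAP.0.get() as *mut u8).add(start) },
            Err(current) => start = current,
        }
    }
}
```

If a function like this were exported as `malloc` and ended up visible to the optimizer (same crate graph, LTO), the returned memory would be part of the `HEAP` static from the Abstract Machine's point of view, which is exactly the situation the questions above are about.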

RalfJung commented 1 month ago

I agree that this magic is potentially problematic. I don't know if LLVM has a way to disable it though.

VorpalBlade commented 1 month ago

> I agree that this magic is potentially problematic. I don't know if LLVM has a way to disable it though.

Fair enough. But I do believe Rust / LLVM need an answer for how to properly handle the above scenarios. How do I do these things soundly in Rust? Can I or can I not use LTO when making a libc, for example?

Also, as I understand it, any soundness issues that cannot be traced to an unsafe block (or unsafe attribute, unsafe command line flags (though I don't think those exist yet?), etc) are compiler bugs? Though in this case I guess the unsafe bit is the no-mangle export of a function called malloc, but that feels like a cop-out and would make it really difficult to write a libc in Rust.

Diggsey commented 1 month ago

> There are clearly special magic rules applying specifically for malloc that mean that its memory must be truly fresh for the Abstract Machine, and cannot be part of any previously existing stack/heap/other allocation.

Could I dig a bit more into why this is important? Could we avoid such issues by having the malloc implementation explicitly "carve out" an existing allocation and give it back to the Abstract Machine, minting a new allocation? I imagine this "carving out" would come with significant limitations, such as no access being allowed to that region of memory until it is returned.

In this model, the malloc implementation accessing the memory after carving it out would be UB.

RalfJung commented 1 month ago

> why this is important?

It's important because LLVM does optimizations and we have to ensure they don't break our model.

This is a descriptivist issue, not a prescriptivist one. There are special magic rules for malloc on LLVM. I don't know the full extent of this magic, it is AFAIK not documented. I can't tell you how important they are, you'll have to ask that on the LLVM forums. I also don't know whether a libc written in C needs to do anything special wrt its malloc symbol to avoid trouble here.

> In this model, the malloc implementation accessing the memory after carving it out would be UB.

That's already part of the model for regular Rust global allocators. This issue is about malloc magic that goes beyond this. My understanding is that LLVM does more things for malloc than it does for our __rust_alloc, and I linked to an example of that in the issue description. In particular, what you describe is nowhere near enough to justify the optimization in the issue description.

Diggsey commented 1 month ago

> It's important because LLVM does optimizations and we have to ensure they don't break our model. This is a descriptivist issue, not a prescriptivist one.

Right, I'm trying to understand what part of the model these optimizations would break if malloc were implemented in the same compilation unit.

> In particular, what you describe is nowhere near enough to justify the optimization in the issue description.

I think I was misunderstanding the nature of this issue, due to:

> There are clearly special magic rules applying specifically for malloc that mean that its memory must be truly fresh for the Abstract Machine, and cannot be part of any previously existing stack/heap/other allocation.

I thought that meant this was something to do with LLVM's intrinsic understanding of memory allocation and deallocation, but actually it could happen with any function that LLVM "knows" doesn't access any IR visible value. LLVM has chosen to only special-case malloc and related functions because they are already "known" and are presumably the most common example of such a function?

Do we know when LLVM considers something to be malloc? Maybe it already handles the case we are worried about (where malloc is implemented in the same compilation unit) by not considering it to be a malloc-like function in that case?

steveklabnik commented 1 month ago

I am not an expert on this just yet, but here is what I do know about LLVM and malloc.

The C standard specifies malloc and friends in section 7.22.3.
That section defines a number of behaviors for this family of functions, that is, they aren't just plain old C functions, but instead work kind of like language built ins.
Older LLVM versions had an explicit malloc instruction: https://releases.llvm.org/1.1/docs/LangRef.html#i_malloc
Newer ones instead use annotations https://llvm.org/docs/LangRef.html#function-attributes

So for example, https://godbolt.org/z/hdPYGf73v

; Function Attrs: nounwind allocsize(0)
declare noalias ptr @malloc(i64 noundef) #1

This is adding the allocsize attribute, and so LLVM knows that it has these semantics.

I would expect any optimizations LLVM does to be in accordance with 7.22.3 of the C spec, but of course, it's not like any set of optimizations is perfect.

That's about where my understanding ends, though. For example, I am curious as to why this doesn't have the allockind("alloc") annotation on it.

CAD97 commented 1 month ago

> [The standard] defines a number of behaviors for this family of functions, that is, they aren't just plain old C functions, but instead work kind of like language built ins.

AIUI, all functions defined by the C and C++ standards work like this. In practice a number of items are provided by a normal looking implementation, but the standard wording is careful to allow non-indirected calls to utilize a different INVOKE mechanism than typical user declared functions.

In C, this is primarily noticeable in that standard functions may be provided as function macros as long as behavior is not impacted and the macro can be suppressed to get a nonmacro implementation.

> Newer [LLVM] instead use annotations

I believe LLVM will still recognize an unmangled malloc symbol with the correct signature for being malloc and optimize it as such. The same goes for any other libc symbol which has compiler knowledge of its behavior, although if behavior isn't exactly specified (e.g. the float math functions) then replacing library calls with built-in functionality is usually gated by a flag.

This is needed for optimizing header-declared versions of the libc functions, which is explicitly allowed by the C standard (at least before C23 which might have changed some things there; something changed about function addresses IIRC, at least for C++’s versions, if not C itself).

> for example, I am curious as to why this doesn't have the allockind("alloc") annotation on it.

AIUI, LLVM treats the absence of a specified allockind as being compatible with the alloc family for malloc.

RalfJung commented 2 weeks ago

> Right, I'm trying to understand what part of the model these optimizations would break if malloc were implemented in the same compilation unit.

As long as it still doesn't read or write any state accessed by outside code, that should be fine.

> I believe LLVM will still recognize an unmangled malloc symbol with the correct signature for being malloc and optimize it as such.

LLVM knows two kinds of allocation functions. The first kind is those marked with the AllocKindFlags::Alloc attribute (that's what our LLVM API calls it anyway), which is what we set for #[global_allocator]. For a discussion of their semantics, see https://github.com/rust-lang/unsafe-code-guidelines/issues/442. But then there's extra special magic specifically for malloc that to my knowledge cannot be triggered with an attribute, and that's what this issue is about.