Tracking issue for `handle_alloc_error` defaulting to panic (for no_std + liballoc)

SimonSapin commented 4 years ago

The proposal below was implemented in https://github.com/rust-lang/rust/pull/76448, feature-gated under #![feature(default_alloc_error_handler)].

Issues to resolve before stabilization:

[ ] Document the new behavior
[ ] Get some usage experience beyond a synthetic test case https://github.com/rust-lang/rust/issues/66741#issuecomment-706069713
[ ] The current implementation might have UB https://github.com/rust-lang/rust/pull/76448#discussion_r502293959 https://github.com/rust-lang/rust/issues/66741#issuecomment-706079813

Removing that unwind attribute needs to be benchmarked as it could lead to many extra unwind edges

Initial proposal:

Summary

This issue is for getting consensus on a change initially proposed in the tracking issue for #[alloc_error_handler]: https://github.com/rust-lang/rust/issues/51540#issuecomment-553114657

When no #[alloc_error_handler] is defined (which implies that std is not linked, since it literally has such a handler), alloc::alloc::handle_alloc_error should default to calling core::panic! with a message identical to the one that std prints to stderr before aborting in that case.

Although https://github.com/rust-lang/rust/issues/51540#issuecomment-556079840 suggested that a full RFC would not be necessary, this is loosely structured after the RFC template.

Background

See the Background section of the sibling issue proposing stabilization of the attribute.

Motivation

As of Rust 1.36, specifying an allocation error handler is the only requirement for using the alloc crate in no_std environments (i.e. without the std crate being also linked in the program) that cannot be fulfilled by users on the Stable release channel.

Removing this requirement by having a default behavior would allow:

no_std + liballoc applications to start running on the Stable channel
no_std applications that run on Stable to start using liballoc

Guide-level explanation

When std is linked in an application, alloc::alloc::handle_alloc_error defaults to printing an error message to stderr and aborting the process.

When std is not linked and no other #[alloc_error_handler] is defined, handle_alloc_error defaults to panicking as if the following handler were defined:

#[alloc_error_handler]
fn default_handler(layout: core::alloc::Layout) -> ! {
    panic!("memory allocation of {} bytes failed", layout.size())
}

Reference-level explanation

The implementation for this would be very similar to that of #[global_allocator]. (Links in the next two paragraphs go to that implementation.)

alloc::alloc::handle_alloc_error is modified to call an extern "Rust" { fn … } declaration.

The definition of this function does not exist in Rust source code. Instead, it is synthesized by the compiler for “top-level” compilations (executables, cdylibs, etc.) when alloc is in the crate dependency graph. If an #[alloc_error_handler] is defined, the synthesized function calls it. If not, the synthesized function calls alloc::alloc::default_error_handler which is a new lang item. (Or is it?)

In order to allow experimentation for this new default behavior, it should initially be gated behind the #![feature(default_alloc_error_handler)] feature flag. When no handler is defined, a call to the default is (at first) only synthesized if any of the crates in the dependency graph has that feature gate. If none of them do, the current compilation error continues to be emitted.

Alternatives

The status quo is that no_std + alloc requires Nightly.

Stabilizing #[alloc_error_handler] or some other mechanism for specifying this handler is another way to unlock the no_std + liballoc on Stable use case. This removes the initial motivation for coming up with this default behavior. However perhaps this default is still desirable? In a no_std environment where there is no process to abort, the allocation error handler will likely be very similar to the panic handler (which is already mandatory).

SimonSapin commented 4 years ago

Proposing FCP for deciding to adopt this approach:

@rfcbot fcp merge

rfcbot commented 4 years ago

Team member @SimonSapin has proposed to merge this. The next step is review by the rest of the tagged team members:

[x] @Amanieu
[x] @Centril
[x] @Kimundi
[x] @KodrAus
[x] @SimonSapin
[ ] @alexcrichton
[x] @cramertj
[x] @dtolnay
[x] @eddyb
[x] @joshtriplett
[x] @nikomatsakis
[ ] @pnkfelix
[x] @scottmcm
[x] @sfackler
[x] @withoutboats

Concerns:

~~what if stable attribute~~ resolved by https://github.com/rust-lang/rust/issues/66741#issuecomment-579280459

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

SimonSapin commented 4 years ago

Stabilizing #[alloc_error_handler] or some other mechanism for specifying this handler is another way to unlock the no_std + liballoc on Stable use case. This removes the initial motivation for coming up with this default behavior. However perhaps this default is still desirable? In a no_std environment where there is no process to abort, the allocation error handler will likely be very similar to the panic handler (which is already mandatory).

Should we still adopt this default behavior if the #[alloc_error_handler] attribute is to be stabilized soon?

@rfcbot concern what if stable attribute

jplatte commented 4 years ago

What's the way forward here? Personally I would answer this simply with "yes":

Should we still adopt this default behavior if the #[alloc_error_handler] attribute is to be stabilized soon?

SimonSapin commented 4 years ago

I’m ok with that. I’ll make this not a blocking concern for now, but anyone feel free to discuss some more.

@rfcbot resolve what if stable attribute

What's the way forward here?

This proposal still needs at least 6 out of the 8 members of @rust-lang/lang and @rust-lang/libs who haven’t yet to approve it in https://github.com/rust-lang/rust/issues/66741#issuecomment-558184215

alexcrichton commented 4 years ago

I personally do not have a mode of checking off my box here which aligns with what I feel. I do not think this is a good change to make and I personally lament the current state of the alloc crate where I do not believe it should have been stabilized in the first place. I would like to register a blocking objection for this but I do not have the time or the energy to do so. On one hand I would like to not be part of the critical path here so it can proceed without me, but as a member of the libs team I'm not sure that's possible.

Overall I feel that alloc has next-to-no leadership and is simply a result of "let's just ship what's there without thinking about it". This has led to very little documentation about how to use it effectively and fundamentally no actual way to use it ergonomically and effectively.

I don't feel strongly enough about this though to pursue a blocking objection, nor am I really that interested in trying to debate the finer points here. The alloc crate will haunt me no matter what whether I check off my box here or even if we decide to not go with this. Overall I feel stuck and don't know what to do. The best I can say is that I know rfcbot doesn't require full sign-off for entering FCP, so I'm going to not check my box off and assume that when enough of others have checked their box off it will proceed without me.

Lokathor commented 4 years ago

Cheer up Alex! It's not all that bad. While I would agree that there's some obvious bad parts to the alloc crate, I think that this particular change is extremely unlikely to cause any backwards compatibility concerns later on.

SimonSapin commented 4 years ago

@alexcrichton Thanks for writing up your thoughts on this. I hear you on these concerns. What do you think of taking some time at the next Rust All Hands for @rust-lang/libs to discuss the crate organization of the standard library and such high-level design?

Ericson2314 commented 4 years ago

@alexcrichton that was my feeling for many years, but recently I've been pleased and impressed with the work on the alloc-wg (which, to be clear, I've only been a belated and minor contributor to, not trying to complement myself here!)

It looks like this RFC wasn't really done in consultation to that working group? Maybe the libs team could kick this over to them, and whatever their decision they could contextualize it in a more thorough long-term design you are interested in.

My personal opinion is that I don't like this RFC either, but if the allocator parameter stuff the working group has prototyped is merged it will matter a lot less as all no_std code can (and should) return alloc errors explicitly with Result giving the caller maximum flexibility to locally handle the error or punt and let the global handler deal with it. In other words, no_std code shouldn't be using this handler at all so I don't care so much how it works.

SimonSapin commented 4 years ago

My understanding of https://github.com/rust-lang/wg-allocators is that it is not about everything allocation-related, but specifically about making it possible to use a non-global allocator with standard library containers. So the behavior of handle_alloc_error is not in scope for that working group.

giving the caller maximum flexibility to locally handle the error or punt and let the global handler deal with it […] no_std code shouldn't be using this handler at all

Box::new and many other existing APIs do call handle_alloc_error on errors and will keep doing so, including on no_std

Ericson2314 commented 4 years ago

My understanding of

That sounds right to me, but if you and/or the rest of the libs team wants to change the scope of the working group, they can. If @alexcrichton feels stretched thin, maybe that's something he'd want to pursue.

Box::new and many other existing APIs do call handle_alloc_error on errors and will keep doing so, including on no_std

That is right and I cannot change it. It is my opinion one ought to use Box::try_new_in instead. But it's just my opinion, and that hasn't landed yet, and so it remains to be seen whether or not core ecosystem crates will make the switch.

Lokathor commented 4 years ago

Even if you always used Box::try_new and Vec::reserve and all that, you'd still need an allocation error handler defined to link alloc into a no_std binary. So even if we encourage people to never call the allocation handler, we would need either this issue or the attribute issue to land for no_std binaries in Stable. Between the two options, I would urge the teams to accept this proposal because it is the minimal amount to stabilize while also allowing Stable no_std + alloc binaries.

withoutboats commented 4 years ago

My understanding of https://github.com/rust-lang/wg-allocators is that it is not about everything allocation-related, but specifically about making it possible to use a non-global allocator with standard library containers. So the behavior of handle_alloc_error is not in scope for that working group.

Agreed. IMO the alloc crate issues have nothing to do with wg-allocators, but instead to do with our story around no-std and support for diverse platforms that don't support all of std. It would be bizarre to link this issue to wg-allocators.

programmerjake commented 4 years ago

one concern with just panicking is: Isn't it undefined behavior for the allocator functions to unwind? What happens if the panic handler unwinds here?

Lokathor commented 4 years ago

The allocator is never supposed to call handle_alloc_error. it is intended entirely for code that tried an allocation and it failed and the code does not want to deal with the failure so it immediately ends the thread.

programmerjake commented 4 years ago

@Lokathor ah, ok. That makes sense.

joshtriplett commented 4 years ago

Checking Centril's box as he's taking a break from the project.

rfcbot commented 4 years ago

:bell: This is now entering its final comment period, as per the review above. :bell:

RalfJung commented 4 years ago

The definition of this function does not exist in Rust source code. Instead, it is synthesized by the compiler for “top-level” compilations (executables, cdylibs, etc.) when alloc is in the crate dependency graph. If an #[alloc_error_handler] is defined, the synthesized function calls it. If not, the synthesized function calls alloc::alloc::default_error_handler which is a new lang item. (Or is it?)

That is a somewhat odd implementation strategy, and in particular it is not how the "default allocator" works that we use in std applications that do not set their own #[global_allocator]. What is the reason for using totally different implementation strategies for what looks like fairly similar mechanisms?

Synthesized code is much harder to read and maintain, so IMO we should keep it to an absolute minimum.

SimonSapin commented 4 years ago

it is not how the "default allocator" works that we use in std applications that do not set their own #[global_allocator]

Isn’t it? I meant to describe exactly the same mechanism.

RalfJung commented 4 years ago

No it's not. The default allocator is written in normal Rust code, not synthesized.

RalfJung commented 4 years ago

In particular, one advantage of having the default be normal Rust code is that it can also be used by Miri (instead of having to duplicate the logic, like we have to when codegen synthesizes this).

SimonSapin commented 4 years ago

Oh I see. I did not mean that the logic of the default allocator is synthesized or that the logic of the handler would be, only that’s there’s a synthesized indirection somewhere. This is what enable liballoc and libstd to both be compiled before we make the decision whether to use libstd’s default allocator or not.

The alloc::alloc::alloc function calls a __rust_alloc symbol which is declared with extern { fn …; }. That function is synthesized at the LLVM level and does not have Rust source. It does nothing but forward the call to either __rdl_alloc for the default allocator or __rg_alloc for #[global_allocator]. __rdl_alloc has Rust source as you pointed out. __rg_alloc is synthesized at the AST level by #[global_allocator], which is similar to other built-in macros like format_args! or #[derive(Clone)]. It forwards the call to <THE_STATIC_ITEM as GlobalAlloc>::alloc.

Synthesized code is much harder to read and maintain

I agree, and now that I’ve typed out the above I think we can simplify it and get rid of LLVM-level synthesis:

#[global_allocator] would expand to a extern fn definitions (like now) that define __rust_alloc and friends directly, not __rg_alloc
We’d have a perma-unstable crate distributed with libstd that contains #[global_allocator] static A: std::alloc::System = std::alloc::System;
For “top-level compilation” that link libstd but no crate that defines a #[global_allocator], rustc would add this new crate to be linked.

#[alloc_error_handler] could work similarly. (Except the “real” default that panics would in a #[no_std] crate, and libstd would override it to abort.)

SimonSapin commented 4 years ago

I imagine the AST emitted by built-in macros gets compiled to MIR like “normal” Rust source and can be interpreted by miri?

RalfJung commented 4 years ago

The alloc::alloc::alloc function calls a rust_alloc symbol which is declared with extern { fn …; }. That function is synthesized at the LLVM level and does not have Rust source. It does nothing but forward the call to either __rdl_alloc for the default allocator or rg_alloc for #[global_allocator].

Ah, that makes much more sense, thanks. :) If the default OOM hook follows the same pattern, that would work fine for Miri.

I imagine the AST emitted by built-in macros gets compiled to MIR like “normal” Rust source and can be interpreted by miri?

Correct, we don't even see that there was a macro involved.

rfcbot commented 4 years ago

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

SimonSapin commented 4 years ago

The discussion starting at https://github.com/rust-lang/rust/issues/66740#issuecomment-637067711 was resolved by closing https://github.com/rust-lang/rust/issues/66740. As far as I can tell the next step here is an implementation PR.

https://github.com/rust-lang/rust/issues/66741#issuecomment-618849636 discusses implementation strategies for the default #[global_allocator], that could also be used for a default #[alloc_error_handler]. (Slightly simpler since it’s a single function.)

haraldh commented 4 years ago

Anybody already working on this, or is it worth starting a new PR?

Amanieu commented 4 years ago

I don't think anyone has started working on this yet. You can claim the issue by following these instructions.

haraldh commented 4 years ago

@rustbot claim

haraldh commented 4 years ago

@Amanieu #76448 is my take on this

haraldh commented 3 years ago

Ok, https://github.com/rust-lang/rust/pull/76448 implemented this and referenced this issue as a tracker.

What are the next steps to stabilization?

Amanieu commented 3 years ago

@rust-lang/libs We've already had an FCP for this, but that was before it was implemented. Do we need another FCP for the implementation or can we stabilize it right away?

SimonSapin commented 3 years ago

I feel another FCP is not necessary, but some feedback from a "real" project (larger than a synthetic test case, perhaps embedded running on actual hardware?) trying this out would be good before stabilization.

jamesmunns commented 3 years ago

For a call for embedded impl, it might be good to ping @adamgreig and @therealprof to nominate this for discussion at the next wg meeting, this could be a good focus project.

RalfJung commented 3 years ago

I also think the current implementation might be UB. Removing that unwind attribute needs to be benchmarked as it could lead to many extra unwind edges, I think.

SimonSapin commented 3 years ago

Since the implementation PR points here as the tracking issue I’ve labelled it accordingly and added a before-stabilization checklist in the description.

Unwinding is pretty inherent to this proposal. But yeah it would be nice if we could somehow skip emitting those landing pad in the common case where the std crate is know to be in the dependency graph (since it overrides the default handler with one that does not unwind). Though that would only help if the relevant code paths of the alloc crate are inlined or maybe LTO’ed…

Diggsey commented 3 years ago

It seems a little surprising to me that linking in std would change the behaviour of the OOM hook, I would generally expect std to be a superset of the functionality of core + alloc + etc. If we can unwind on allocation failures with core, can't we unwind on allocation failures with std?

If std is left alone purely for backwards compatibility reasons, can we at least have a plan to bring the defaults back in-line with each other, perhaps in the next edition?

RalfJung commented 3 years ago

@Diggsey it's more like, std happens to implement the alloc-error handler in a way that never panics, so we'd like optimizations to exploit that. Crates using std cannot set their own handler because std already sets one.

Crates not using std can set their own handler, using an untsable feature. If they do not, they get a default. Or is your comment about the fact that that default is different from what std sets the handler too? Yeah I think I can see how that is kind of unexpected. (I first thought you were responding to @SimonSapin so I was confused.)

If std is left alone purely for backwards compatibility reasons, can we at least have a plan to bring the defaults back in-line with each other, perhaps in the next edition?

Editions are a per-crate thing while the handler is global for the crate tree, so editions do not help here.

Lokathor commented 3 years ago

note that in practice you can't unwind in just core, as provided by the Rust project, because there's no unwind mechanism provided in core. You're allowed to write your own, but that's not really the same of course.

RalfJung commented 3 years ago

Oh so the argument for the nounwind is that without std, there cannot be any uwninding, the panic will necessarily halt execution?

Diggsey commented 3 years ago

Or is your comment about the fact that that default is different from what std sets the handler too? Yeah I think I can see how that is kind of unexpected.

Yeah exactly.

Editions are a per-crate thing while the handler is global for the crate tree, so editions do not help here.

I don't think that's a problem, and I could be remembering wrong, but isn't this exactly how we handled changing the default allocator?

Ultimately one crate is the root, and that crate can determine the default handler based on its edition. (With an attribute to override the default that would be added automatically when you upgrade with cargo fix).

RalfJung commented 3 years ago

I am not aware that the default allocator is in any way related to the edition, but maybe I missed when that happened?

Lokathor commented 3 years ago

@RalfJung it would be more accurate to say that without std you don't get an eh_personality implementation provided, which does all the stack unwinding. Users are free to write their own, if they want, but in practice almost everyone just sets "panic = abort" in Cargo.toml and then gets on with things.

So you could unwind without std, and we can't break that, though few choose to.

Amanieu commented 3 years ago

I don't think we gain much from eliminating the OOM unwinding edges considering the allocator itself can already panic.

I would like std to eventually also transition to panicking instead of aborting on OOM since it gives applications a chance to handle OOMs and return an error. This is needed for libraries such as cURL that use Rust components and need to be robust when encountering OOM situations.

RalfJung commented 3 years ago

I don't think we gain much from eliminating the OOM unwinding edges considering the allocator itself can already panic.

The allocator is also marked #[rustc_allocator_nounwind]. Why do you say it can panic?

Amanieu commented 3 years ago

Ah actually it seems that the GlobalAlloc trait doesn't allow unwinding:

It's undefined behavior if global allocators unwind. This restriction may be lifted in the future, but currently a panic from any of these functions may lead to memory unsafety.

Looking through some old threads, it seems that allowing allocators to unwind does indeed cause some code size regressions: #42808

From https://github.com/rust-lang/rust/issues/42808#issuecomment-323839473 it seems that most of the code size regression comes from unwinding edges from dealloc so we could only keep dealloc as nounwind while allowing the other allocator methods to unwind, including the OOM handler.

tmiasko commented 3 years ago

Unwinding on allocation failure is also incompatible with box expressions. They are lowered to _0 = Box(T);, and cannot affect a control flow.

SimonSapin commented 3 years ago

That sounds like a detail of the current implementation rather than a characteristic of the language? And if that’s about the box keyword, it’s not stable so we can change anything about its behavior.

tmiasko commented 3 years ago

The liballoc uses box expression, so either lowering or liballoc should change to ensure it works correctly in the presence of unwinding.

rust-lang / rust