rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
98.16k stars 12.69k forks source link

Tracking issue for the OOM hook (`alloc_error_hook`) #51245

Open glandium opened 6 years ago

glandium commented 6 years ago

PR #50880 added an API to override the std OOM handler, similarly to the panic hook. This was discussed previously in issue #49668, after PR #50144 moved OOM handling out of the Alloc/GlobalAlloc traits. The API is somewhat similar to what existed before PR #42727 removed it without an explanation. This issue tracks the stabilization of this API.

Defined in the std::alloc module:

pub fn set_oom_hook(hook: fn(Layout) -> !);
pub fn take_oom_hook() -> fn(Layout) -> !;

pub fn set_alloc_error_hook(hook: fn(Layout));
pub fn take_alloc_error_hook() -> fn(Layout);

CC @rust-lang/libs, @SimonSapin

Unresolved questions

SimonSapin commented 6 years ago

We have an accepted RFC https://github.com/rust-lang/rust/issues/48043 that includes adding a way to opt into of panicking instead of aborting on OOM (so that catch_unwind could be used to recover from OOM at coarse granularity). Perhaps this could be the mechanism for that?

glandium commented 6 years ago

Oh right, I had forgotten about that. And in fact, it works:

#![feature(allocator_api)]

use std::alloc::set_oom_hook;
use std::alloc::Layout;

fn oom(layout: Layout) -> ! {
    panic!("oom {}", layout.size());
}

fn main() {
    set_oom_hook(oom);
    let result = std::panic::catch_unwind(|| {
        let v = Vec::<u8>::with_capacity(100000000000000);
        println!("{:p}", &v[..]);
    });
    println!("{:?}", result);
}

cargo run with last nightly:

thread 'main' panicked at 'oom 100000000000000', src/main.rs:7:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.
Err(Any)
glandium commented 6 years ago

(despite the #[rustc_allocator_nounwind] annotation on alloc::alloc::oom)

glandium commented 6 years ago

Added item about interaction with unwinding.

nikic commented 6 years ago

@glandium While this example works, the #[rustc_allocator_nounwind] annotation will (likely) prevent some drops from running during unwinding (in particular those that would have to be run inside the function that performs the oom call).

glandium commented 6 years ago

Picture me puzzled.

#![feature(allocator_api)]

use std::alloc::set_oom_hook;
use std::alloc::Layout;

struct Foo;

impl Drop for Foo {
    fn drop(&mut self) {
        println!("Foo");
    }
}

fn oom(layout: Layout) -> ! {
    let f = Foo;
    panic!("oom {}", layout.size());
}

fn main() {
    set_oom_hook(oom);
    let result = std::panic::catch_unwind(|| {
        let f = Foo;
        let v = Vec::<u8>::with_capacity(100000000000000);
        println!("{:p}", &v[..]);
    });
    println!("{:?}", result);
}
thread 'main' panicked at 'oom 100000000000000', src/main.rs:16:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.
Foo
Foo
Err(Any)
glandium commented 6 years ago

Or

#![feature(allocator_api)]

use std::alloc::set_oom_hook;
use std::alloc::Layout;

struct Foo;

impl Drop for Foo {
    fn drop(&mut self) {
        println!("Foo");
    }
}

fn oom(layout: Layout) -> ! {
    let f = Foo;
    panic!("oom {}", layout.size());
}

fn main() {
    set_oom_hook(oom);
    let result = std::panic::catch_unwind(|| {
        let f = Foo;
        std::alloc::oom(Layout::new::<u8>());
        println!("after oom");
    });
    println!("{:?}", result);
}
thread 'main' panicked at 'oom 1', src/main.rs:16:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.
Foo
Foo
Err(Any)
glandium commented 6 years ago

Ah, I forgot --release, and with it, in both cases, the Foo instance in the closure is not dropped (the one in oom is, though).

alexcrichton commented 6 years ago

With panic hooks they notably return () instead of ! which allows you to chain them (if necessary). I think if we want to use the terminology "hook" for OOM we probably want the same (not returning !) as that'll allow us to also justify take_oom_hook as otherwise I think you can't actually call that and/or delegate in an order where you don't go first?

OOM hooks could always panic/abort themselves of course but the signature may just want to be that by default we don't require it and then the fallback of OOM is to abort as OOM does today

glandium commented 6 years ago

take_oom_hook is still useful albeit annoying, since hooks can't be closures. So you'd have to store the value in a global, but then you can call it from your own function. I can see how returning () would make things more flexible, though.

SimonSapin commented 6 years ago

Why does take_ unregister the hook, as well as returning it? More generally, what’s an example where this function is useful?

glandium commented 6 years ago

This matches panic hooks, and while not exactly simple (because you can't use a closure as hook), it can be used to do whatever your hook does, while still printing the default message from libstd, whatever it is, without having to care about doing it properly yourself (which the current implementation doesn't do, since it can end up allocating aiui, which rather makes the point: one shouldn't try to replicate the code of the default handler)

glandium commented 6 years ago

Or, if another hook was already set, it allows to fall back to that one.

SimonSapin commented 6 years ago

On accessing the current/default hook, sure. But why unregistering it in the process? fn(…) pointer types are Copy, unlike Box<Fn(…)>.

glandium commented 6 years ago

Would you rather add a function to unregister the current hook and break the similarity with the panic hook?

sfackler commented 6 years ago

Why is the oom hook a function rather than a closure?

glandium commented 6 years ago

Because a closure can't be stored in an Atomic afaict, and using an Atomic is necessary because RwLock or Mutex can be lazily initialized and allocate memory on some platforms. You don't want to be allocating memory when trying to retrieve the hook while you're handling an oom.

sfackler commented 6 years ago

Seems like we can avoid that problem with a flag tracking if set_hook has ever been called. If it hasn't, we just call the default hook directly and avoid touching the mutex.

glandium commented 6 years ago

Well, another (new) reason is that we eventually want to move this to liballoc, which doesn't have access to any kind of locking primitive.

gnzlbg commented 6 years ago

@glandium

Because a closure can't be stored in an Atomic afaict,

It can (playground):

#![feature(atomic_integers, const_fn)]
use std::sync::atomic::AtomicUsize;
use std::sync::atomic::Ordering;

const unsafe fn transmute(x: *mut u8) -> usize {
    union T {
        a: *mut u8,
        b: usize
    }
    T { a: x }.b
}

const BAR: *mut u8 = ((|| 3) as fn() -> i32) as *mut u8;
const FOO: AtomicUsize = AtomicUsize::new(unsafe { transmute(BAR) });
// static FOO: AtomicUsize = AtomicUsize::new(unsafe { transmute(BAR) }); // ALSO OK

fn main() {
    let l = FOO.load(Ordering::Relaxed);
    let l: fn() -> i32 = unsafe { std::mem::transmute(l) };
    assert_eq!(l(), 3);
}
SimonSapin commented 6 years ago

Only closures that do not capture anything can be cast to function pointers, not arbitrary F: Fn(…) values.

SimonSapin commented 6 years ago

https://github.com/rust-lang/rust/pull/51543 propose renaming the functions to set_alloc_error_hook and take_alloc_error_hook.

gnzlbg commented 6 years ago

Only closures that do not capture anything can be cast to function pointers, not arbitrary F: Fn(…) values.

Of course. I thought we were only talking about closures without any environment, since I don't see how panic hooks with state make much sense (maybe someone can elaborate on that).

If you need to set a closure with an environment as a panic hook you just have to implement a struct representing the environment, implement the fn traits on that, and put the struct in a static in a properly synchronized way. Depending on how big the environment is, you might get away with a single extra atomic, or you might need a Mutex+Arc combo for it. Using a Mutex inside a panic hook sounds like a pretty bad idea to me though.

sfackler commented 6 years ago

@gnzlbg the simple case is grabbing the old hook and wrapping it with your own.

landaire commented 4 years ago

Just wanted to provide an update to @glandium's sample code that works as of March 2020. Some of the PRs changed the feature name this was gated behind and renamed the function to set the error hook.

#![feature(alloc_error_hook)]

use std::alloc::set_alloc_error_hook;
use std::alloc::Layout;

fn oom(layout: Layout) {
    panic!("oom {}", layout.size());
}

fn main() {
    set_alloc_error_hook(oom);
}
mitsuhiko commented 4 years ago

About the open point 2 "Interaction with unwinding". It would be really nice if this could allow the hook to panic. Reason being that you can then catch such a panic at the C boundary. This would help cases like this: https://github.com/hyperium/hyper/issues/2265

seanmonstar commented 4 years ago

Besides the interaction with unwinding, which I would like to be able to rely on, I'm also interested in the timeline of stabilizing this. It looks fairly quiet, if we answer the unwinding question, is there anything else preventing stabilizing soon?

kornelski commented 3 years ago

I think set should always return the previous hook it replaced (Option<fn>). Having set and take as separate functions makes "patching" an existing hook a technically unreliable operation, because someone else could register a hook between a call to take and set.

glandium commented 3 years ago

Note this was modeled on the panic hook API https://doc.rust-lang.org/beta/std/panic/fn.set_hook.html https://doc.rust-lang.org/beta/std/panic/fn.take_hook.html

kornelski commented 3 years ago

The oom=panic #43596 issue lists this handler as its dependency, but as far as I understand currently the handler forbids unwinding. So it's not fit for purpose of oom=panic.

Is this implementation going to change to allow unwinding, or is oom=panic going to have to add its own handler using some other way?

glandium commented 3 years ago

The implementation itself doesn't prevent unwinding. What does is pre-existing code that marks oom functions as rustc_allocator_nounwind. The question would be whether they can be removed.

Monadic-Cat commented 2 years ago

Now that #88098 has been merged, what's the status of this issue?

glandium commented 2 years ago

There's one unresolved question:

Should this move to its own module, or stay in std::alloc?

Badel2 commented 1 year ago

Hi, I would love to see this feature stabilized. My use-case is handling out-of-memory errors in WebAssembly as panics to avoid aborting the process, as that results in an unhelpful "unreachable executed" message. As far as I know that cannot be achieved using stable Rust, as even the oom=panic flag needs nightly.

However I do have one concern. This API was modeled after std::panic::set_hook, which is a reasonable choice. However that API has some problems when multiple threads want to set a hook at the same time. The main use case seems to be some libraries trying to silence panic messages in their initialization code. I implemented std::panic::update_hook to try to mitigate that issue, but it does not seem ready yet: https://github.com/rust-lang/rust/issues/92649

So I am wondering if there is some easy way to avoid that problem here before the API is stabilized. I don't know if the use case of a library trying to temporarily set a different oom-handler is common enough to worry about it, or if it will even exist. The simplest solution I can think of is to make set_alloc_error_hook return the old hook. This will allow users to access an atomic swap primitive, which can then be extended into more complex use-cases in external libraries. Let me know what you think.

Amanieu commented 1 year ago

In https://github.com/rust-lang/rust/issues/51540#issuecomment-1448404145 I argue for the removal of special handling for OOMs in favor of just treating them as a panic. Comment copied below.


After working on the OOM handler for a while, I think that the best way to move forward is to just treat OOM as a normal panic (so that it calls the normal panic handler/hooks). This is what already happens on #![no_std] since https://github.com/rust-lang/rust/pull/102318 was merged.

I believe that we should do the same for the std case. Specifically:

Amanieu commented 1 year ago

FYI #51540 is under FCP for closing, which will also close this tracking issue. Please comment in that thread if you have any objections.

SimonSapin commented 10 months ago

This tracking issue was closed when PR https://github.com/rust-lang/rust/pull/109507 merged to remove the feature, but that was reverted in PR https://github.com/rust-lang/rust/pull/110782. Reopening to reflect that.

PR https://github.com/rust-lang/rust/pull/112331 is open to remove the feature again, and is already marked as closing this.