rust-lang / miri

An interpreter for Rust's mid-level intermediate representation

Big `Vec::try_reserve` OOMs Miri (slowly) #3637

Closed. DaniPopes closed this 1 month ago.

DaniPopes commented 1 month ago

On both the default GitHub CI Linux runner (which should have 16 GB RAM) and locally (also Linux, 64 GB RAM), the following test runs out of memory under Miri, but panics under a normal `cargo test`:

#[test]
fn try_allocate_a_lot() {
    let mut v = Vec::<u8>::new();
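    // 128 * 1024^3 bytes = 128 GiB; the host's allocator refuses this, but Miri OOMs.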
    v.try_reserve(128 * 1024 * 1024 * 1024).unwrap();
    let _ = v;
}

Apologies if this has been reported before; I could not find another issue for this.

bjorn3 commented 1 month ago

> CTRL+C does not stop it

Does a second Ctrl-C stop it? Also, since https://github.com/rust-lang/rust/pull/125523, it should exit at most 100 ms after a Ctrl-C.

RalfJung commented 1 month ago

Yeah, this is pretty much known; see https://github.com/rust-lang/miri/issues/613. We don't make any attempt at recovering from allocation failure.

RalfJung commented 1 month ago

However, what I would expect to happen is that Miri itself aborts due to OOM. Maybe the reason this does not happen is that Miri uses jemalloc under the hood, whereas cargo test will use the system allocator. So maybe malloc fails immediately on an allocation of that size but jemalloc tries anyway, and then your system grinds to a halt due to swapping.
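One way to check that hypothesis is a minimal sketch like the following, which opts the test binary into the system allocator explicitly (`std::alloc::System`, the allocator a plain `cargo test` binary ends up using); on the reporter's machines the reservation fails fast rather than grinding through page faults:

use std::alloc::System;

// Force the system allocator, which is what a plain `cargo test` binary uses.
#[global_allocator]
static GLOBAL: System = System;

fn main() {
    let mut v = Vec::<u8>::new();
    // Per the original report, a 128 GiB request fails immediately here,
    // so try_reserve returns Err instead of hanging.
    assert!(v.try_reserve(128 * 1024 * 1024 * 1024).is_err());
}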

RalfJung commented 1 month ago

FWIW when I try to run that function locally, the program gets swiftly killed. So I can't reproduce the hang you seem to be seeing.

DaniPopes commented 1 month ago

> Does a second Ctrl-C stop it? Also, since https://github.com/rust-lang/rust/pull/125523, it should exit at most 100 ms after a Ctrl-C.

> FWIW when I try to run that function locally, the program gets swiftly killed. So I can't reproduce the hang you seem to be seeing.

I was on 1.80.0-nightly (1ba35e9bb 2024-05-25), where Ctrl-C would exit cargo but miri would still be allocating in the background, forcing me to kill it manually. I just updated to rustc 1.80.0-nightly (bdbbb6c6a 2024-05-26) and now it does exit fully on Ctrl-C.

saethlin commented 1 month ago

This example program tries to allocate 128 GB, which is a totally plausible amount of memory to have; for example, it's the amount of physical memory my desktop has. I suggest that if you want to write a test for OOMs, you use a number that's a few factors of 2 higher.

This doesn't immediately abort the interpreter because jemalloc maps such allocations with the MAP_NORESERVE flag. The interpreter looks like it is stuck because every allocation in the interpreter is a Box<[u8]> holding the actual bytes of the allocation's data, with the initialization state tracked explicitly alongside. The bytes are always initialized, and the runtime you're seeing is almost certainly all the page faults from zeroing all that memory. Which, yes, is silly, because mmap returns zeroed memory and we could totally use MaybeUninit here for efficiency, but the risk of getting that wrong seems sketchy. We try rather hard to avoid unsafe code in the interpreter.
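To illustrate the representation being described, here is a minimal sketch; the names are illustrative, not Miri's actual types, and real Miri stores the init state as a compressed bitmask rather than one bool per byte:

// Sketch of the shape described above: the data bytes are always fully
// materialized, and initialization is tracked separately alongside them.
struct InterpAllocation {
    bytes: Box<[u8]>,     // the actual bytes; always allocated and zeroed up front
    init_mask: Vec<bool>, // which bytes the interpreted program has initialized
}

impl InterpAllocation {
    fn uninit(size: usize) -> Self {
        InterpAllocation {
            // `vec![0u8; size]` can go through alloc_zeroed, but any later
            // read or copy of the buffer still faults in every page.
            bytes: vec![0u8; size].into_boxed_slice(),
            init_mask: vec![false; size],
        }
    }
}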

saethlin commented 1 month ago

Note that on my desktop, the threshold for an allocation failing with jemalloc is somewhere between 64 and 128 TB:

use jemallocator::Jemalloc;

#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;

fn main() {
    let mut v = Vec::<u8>::new();
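    // 128 * 1024^4 bytes = 128 TiB: past the point where jemalloc gives up.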
    v.try_reserve(128 * 1024 * 1024 * 1024 * 1024).unwrap();
    let _ = v;
}

bjorn3 commented 1 month ago

Trying to allocate an uninit Allocation<_, _, Bytes> via Allocation::uninit() calls Bytes::zeroed, which for MiriBytes uses alloc::alloc_zeroed. That should make the allocator either zero the bytes for us or, if it mmap'ed a large allocation, directly return the pre-zeroed mmap'ed range.
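As a minimal sketch of that path (zeroed_bytes is a hypothetical helper, not Miri's code; it just shows the std call the above bottoms out in):

use std::alloc::{alloc_zeroed, Layout};

// alloc_zeroed lets the allocator either memset the bytes for us or, for
// large requests it satisfies via mmap, hand back pages that are already zero.
fn zeroed_bytes(size: usize) -> Box<[u8]> {
    assert!(size > 0, "zero-sized allocations need special handling");
    let layout = Layout::array::<u8>(size).expect("size overflows Layout");
    let ptr = unsafe { alloc_zeroed(layout) };
    assert!(!ptr.is_null(), "allocation failure");
    // SAFETY: `ptr` is a valid allocation of `size` bytes with align 1,
    // matching the layout `Box<[u8]>` will use to deallocate it.
    unsafe { Box::from_raw(std::ptr::slice_from_raw_parts_mut(ptr, size)) }
}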

RalfJung commented 1 month ago

Yeah, we are using efficient zero-init if the allocator supports it.

saethlin commented 1 month ago

The page faults are coming from <miri::machine::MiriMachine as rustc_const_eval::interpret::machine::Machine>::adjust_allocation. I think MiriAllocBytes is copying the allocation's data; I'm just not sure where exactly. I'm fighting with perf and bootstrap to find the right flags across Miri and the compiler that will let me profile this.

saethlin commented 1 month ago

I still haven't made bootstrap cooperate, but I'm pretty sure this is just the overhead from the fact that Miri's adjust_allocation clones all allocations.
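In other words (a minimal sketch of the effect, not Miri's actual code): even if the source buffer came from alloc_zeroed and its pages were never committed, cloning it reads every source byte and writes every destination byte, faulting everything in:

// Cloning a freshly alloc_zeroed buffer touches every page of both the
// source and the destination, defeating the lazy zero-page optimization.
fn adjust_allocation_like(bytes: &[u8]) -> Box<[u8]> {
    bytes.to_vec().into_boxed_slice()
}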

RalfJung commented 1 month ago

Oh right, we always go through adjust_allocation... that's kind of silly.