rustwasm / wee_alloc

The Wasm-Enabled, Elfin Allocator
Mozilla Public License 2.0
665 stars 49 forks source link

Unbounded Memory Leak #106

Open CraigMacomber opened 2 years ago

CraigMacomber commented 2 years ago

Describe the Bug

Making two large allocations, then dropping them in the order they were allocated leaks memory in wee_alloc, but not in default allocator.

Steps to Reproduce

Native code

  1. Clone this branch of my app: https://github.com/CraigMacomber/wee_alloc_leak_png
  2. cargo run
  3. GB of memory is consumed per second
  4. Comment out use of wee_alloc global_allocator.
  5. cargo run
  6. Memory use does not increase.

Wasm

  1. Clone this branch of my app: https://github.com/noencke/uuid-cluster/tree/wee_alloc_leak
  2. npm run serve (runs wasm-pack, npm install and webpack dev server)
  3. go to localhost:8080 in browser
  4. click the button (it's the only thing on the page) to increase heap by 131 MB each time

Expected Behavior

Second time allocations are made uses free list.

Actual Behavior

Repeated allocations grow heap infinitely.

Additional Context

This seems similar to an issue mentioned in #105 Heap size does not increase if using default allocator.

Rust source for example is just:

extern crate wee_alloc;

// Use `wee_alloc` as the global allocator.
#[global_allocator]
static ALLOC: wee_alloc::WeeAlloc = wee_alloc::WeeAlloc::INIT;

pub fn leak_test() {
    // This leaks when using wee_alloc
    let a = Box::new([0; 85196]);
    let b = Box::new([0; 80000]);
    drop(a);
    drop(b);
}

fn main() {
    loop {
        leak_test();
    }
}
CraigMacomber commented 2 years ago

I updated the report to show a much more minimal repro.

I have determined that something in the compiler toolchain can elide the allocations that cause this issue when optimizing if I inline everything. Thus my repro uses a reduce optimization level.

The explicit drop calls are needed: otherwise not leak is produced. Dropping in the reverse order still reproduces: it seems that the thing that matters here is the lifetime for both allocations overlap.

The sizes do matter. For example if they are ~10x lower there is no leak. The sizes do not have to be different, or these specific values. For example, this still leaks:

pub fn leak_test() {
    // This leaks when using wee_alloc
    let a = Box::new([0; 80000]);
    let b = Box::new([0; 80000]);
    drop(b);
    drop(a);
}
CraigMacomber commented 2 years ago

After more testing, size 8190 (and less) does not have the issue, and 8191 (and higher) do. This is just below 2^13, so I suspect the issue is when the allocation + some overhead exceeds 2^13, the memory is not reused.

CraigMacomber commented 2 years ago

This reproduces in all versions of wee_alloc that can compile for me: current master, 0.4.5, 0.4.4, 0.4.3 The other versions (0.1.0, 0.2.0, 0.3.0, 0.4.0, 0.4.1, 0.4.2) do not build.

CraigMacomber commented 2 years ago

8190 and 8191 byte allocations (sum: 16381): leaks 8190 and 8190 byte allocations (sum: 16380): does not leak 8192 and 8189 byte allocations (sum: 16381): leaks 8192 and 8188 byte allocations (sum: 16380): does not leak 8193 and 8188 byte allocations (sum: 16381): leaks 9193 and 7188 byte allocations (sum: 16381): leaks 9192 and 7188 byte allocations (sum: 16380): does not leak 14193 and 2188 byte allocations (sum: 16381): leaks

16281 and 100 byte allocations (sum: 16381): does not leak 16181 and 200 byte allocations (sum: 16381): does not leak 16081 and 300 byte allocations (sum: 16381): does not leak

Looks like it leaks if the two allocations sum to 16381 (~2^14) or more, and both exceed some minimum size threshold, it leaks, except I found these cases which doesn't leak, so thats not 100%.

16081 and 2188 byte allocations (sum: 18269): does not leak 16081 and 16081 byte allocations (sum: 32162): does not leak

stephanemagnenat commented 2 years ago

In case that is useful for others, I also encountered this bug while compressing images using png-rs. After creating a simple test program that generates a random PNG, compresses it, decompresses it, and validates it, using wee_alloc leaks 1 GB per second on my desktop, while the default allocator keeps the memory within 4 MB.

CraigMacomber commented 2 years ago

In case that is useful for others, I also encountered this bug while compressing images using png-rs. After creating a simple test program that generates a random PNG, compresses it, decompresses it, and validates it, using wee_alloc leaks 1 GB per second on my desktop, while the default allocator keeps the memory within 4 MB.

I have confirmed your repro does leak memory.

My repro was from a deflate implementation. Png uses deflate for its compression, so I expect its nearly identical to my repro, except yours is native code instead of wasm. I prefer the no-wasm repro, so I have made a fork of your example based on my simplified 2 allocations version: https://github.com/CraigMacomber/wee_alloc_leak_png

CraigMacomber commented 2 years ago

Here is an updated repro.

My best guess for whats happening is retrieving space for the first allocation from the free list (which hits the edge case of the space being large enough, but not having enough extra space in the allocation to split the node in the free list) is incorrectly nulling out the head of the free list, leaking the other allocations remaining in the free list and causing the second allocation to have to allocate new memory.

use std::alloc::GlobalAlloc;

extern crate wee_alloc;

// Use `wee_alloc` as the global allocator.
#[global_allocator]
static ALLOC: wee_alloc::WeeAlloc = wee_alloc::WeeAlloc::INIT;

fn main() {
    loop {
        let l1 = core::alloc::Layout::from_size_align(85196, 1).unwrap();
        let l2 = core::alloc::Layout::from_size_align(80000, 1).unwrap();
        unsafe {
            // Allocate first block: this one gets reused from the free list,
            // but also causes the head of the free list to be set to null when it should still have space in it.
            let x1 = ALLOC.alloc(l1);
            // Allocate second block: this one does not reuse space from the free list (which seems like a bug).
            // At this point wee_alloc's free list's head (as used in alloc_first_fit) is null.
            // This results in a new allocation.
            let x2 = ALLOC.alloc(l2);
            println!("{x1:p}, {x2:p}");
            ALLOC.dealloc(x1, l1);
            ALLOC.dealloc(x2, l2);
        }
    }
}

The call stack when the head of the free list is set to null (when allocating the first allocation in the loop on the second iteration of the loop, from the second node in the free list). In "try_alloc" previous is the overall head of the list, and that line sets it to null. My understanding is that this causes it to forget about the first item in the free list (which was too small for this allocation, but would fix the next one).

try_alloc (/home/craig/Work/wee_alloc/wee_alloc/src/lib.rs:579)
{closure#0} (/home/craig/Work/wee_alloc/wee_alloc/src/lib.rs:923)
walk_free_list<wee_alloc::alloc_first_fit::{closure#0}, core::ptr::non_null::NonNull<u8>> (/home/craig/Work/wee_alloc/wee_alloc/src/lib.rs:903)
alloc_first_fit (/home/craig/Work/wee_alloc/wee_alloc/src/lib.rs:920)
alloc_with_refill (/home/craig/Work/wee_alloc/wee_alloc/src/lib.rs:938)
{closure#0} (/home/craig/Work/wee_alloc/wee_alloc/src/lib.rs:1049)
{closure#1}<wee_alloc::{impl#9}::alloc_impl::{closure#0}, core::result::Result<core::ptr::non_null::NonNull<u8>, wee_alloc::AllocErr>> (/home/craig/Work/wee_alloc/wee_alloc/src/lib.rs:1009)
with_exclusive_access<*const wee_alloc::FreeCell, wee_alloc::{impl#9}::with_free_list_and_policy_for_size::{closure#1}, core::result::Result<core::ptr::non_null::NonNull<u8>, wee_alloc::AllocErr>> (/home/craig/Work/wee_alloc/wee_alloc/src/imp_unix.rs:58)
with_free_list_and_policy_for_size<wee_alloc::{impl#9}::alloc_impl::{closure#0}, core::result::Result<core::ptr::non_null::NonNull<u8>, wee_alloc::AllocErr>> (/home/craig/Work/wee_alloc/wee_alloc/src/lib.rs:1007)
alloc_impl (/home/craig/Work/wee_alloc/wee_alloc/src/lib.rs:1047)
alloc (/home/craig/Work/wee_alloc/wee_alloc/src/lib.rs:1160)
main (/home/craig/Work/wee_alloc_leak_png/src/main.rs:24)
call_once<fn(), ()> (@core::ops::function::FnOnce::call_once::haa67c05d5f591be8:6)
__rust_begin_short_backtrace<fn(), ()> (@std::sys_common::backtrace::__rust_begin_short_backtrace::h4c4ff022fb2260fd:6)
{closure#0}<()> (@std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h690765e0f0300934:7)
core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::h443f738a8e9f947a (@std::rt::lang_start_internal::h52e73755f77c7dd9:226)
std::panicking::try::do_call::h1e21ba261ba489ec (@std::rt::lang_start_internal::h52e73755f77c7dd9:225)
std::panicking::try::h6afd48af8b6c96ac (@std::rt::lang_start_internal::h52e73755f77c7dd9:225)
std::panic::catch_unwind::h85dd95e0bab7fb60 (@std::rt::lang_start_internal::h52e73755f77c7dd9:225)
std::rt::lang_start_internal::_$u7b$$u7b$closure$u7d$$u7d$::h038455e697c8b03e (@std::rt::lang_start_internal::h52e73755f77c7dd9:225)
CraigMacomber commented 2 years ago

A different related repro (I have updated the example linked in the top post to include this):

fn main() {
    static WEE: wee_alloc::WeeAlloc = wee_alloc::WeeAlloc::INIT;
    loop {
        // These are magic sizes for when size categories are disabled.
        // If either of these sizes is decreased, it no longer behaves the same (no longer gets new allocations for both every time).
        let l1 = core::alloc::Layout::from_size_align(2033, 1).unwrap(); // Rounds up to 255 words.
        let l2 = core::alloc::Layout::from_size_align(2025, 1).unwrap(); // Rounds up to 254 words.
        unsafe {
            // Allocate first block: this one gets space that was use by the second allocation on the last iteration, but starting at an address 8 bytes lower.
            let x1 = WEE.alloc(l1);
            // Allocate second block: this one does not reuse space from the free list (which seems like a bug).
            // At this point wee_alloc's free list seems to only contain one item which does not fit.
            let x2 = WEE.alloc(l2);
            println!("{x1:p}, {x2:p}");
            WEE.dealloc(x1, l1);
            WEE.dealloc(x2, l2);
        }
    }
}