wvwwvwwv / scalable-concurrent-containers

High performance containers and utilities for concurrent and asynchronous programming
Apache License 2.0

TreeIndex memory leak #156

Closed: lithbitren closed this issue 1 month ago

lithbitren commented 1 month ago
const ROUNDS: usize = 10;
const TIMES: usize = 1_000_000;

fn main() {
    println!("Initial memory usage: {} bytes", get_memory_usage());

    let start_time = std::time::Instant::now();
    for round in 1..=ROUNDS {
        let map = scc::TreeIndex::new();
        for index in 0..TIMES {
            map.insert(index, index).ok();
        }
        let memory_before_drop = get_memory_usage();
        drop(map);
        let memory_after_drop = get_memory_usage();
        let elapsed_time = start_time.elapsed();
        println!(
            "Round: {round:2}, Time elapsed: {:?}, Memory before drop: {} bytes, after drop: {} bytes",
            elapsed_time, memory_before_drop, memory_after_drop
        );
    }
}

use std::sync::atomic::{AtomicUsize, Ordering};
use std::alloc::{GlobalAlloc, Layout, System};

// Net number of bytes currently allocated through the global allocator.
static INNER_MEMORY: AtomicUsize = AtomicUsize::new(0);

fn get_memory_usage() -> usize {
    INNER_MEMORY.load(Ordering::Relaxed)
}

// Thin wrapper around the system allocator that tracks net heap usage.
struct TrackedAlloc;

#[global_allocator]
static ALLOC: TrackedAlloc = TrackedAlloc;

unsafe impl GlobalAlloc for TrackedAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ret = System.alloc(layout);
        if !ret.is_null() {
            INNER_MEMORY.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ret
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout);
        INNER_MEMORY.fetch_sub(layout.size(), Ordering::Relaxed);
    }

    unsafe fn alloc_zeroed(&self, layout: Layout) -> *mut u8 {
        let ret = System.alloc_zeroed(layout);
        if !ret.is_null() {
            INNER_MEMORY.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ret
    }
    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        let ret = System.realloc(ptr, layout, new_size);
        if !ret.is_null() {
            // Wrapping arithmetic handles shrinking reallocations: adding the
            // wrapped difference is equivalent to subtracting the shrinkage.
            INNER_MEMORY.fetch_add(new_size.wrapping_sub(layout.size()), Ordering::Relaxed);
        }
        ret
    }
}

output:

Initial memory usage: 56 bytes
Round:  1, Time elapsed: 204.116742ms, Memory before drop: 41539368 bytes, after drop: 41539368 bytes
Round:  2, Time elapsed: 398.779136ms, Memory before drop: 79576104 bytes, after drop: 79576104 bytes
Round:  3, Time elapsed: 599.772043ms, Memory before drop: 116249384 bytes, after drop: 116249384 bytes
Round:  4, Time elapsed: 804.836883ms, Memory before drop: 151544616 bytes, after drop: 151544616 bytes
Round:  5, Time elapsed: 1.017803271s, Memory before drop: 185447464 bytes, after drop: 185447464 bytes
Round:  6, Time elapsed: 1.242380062s, Memory before drop: 217945128 bytes, after drop: 217945128 bytes
Round:  7, Time elapsed: 1.448112848s, Memory before drop: 249020536 bytes, after drop: 249020536 bytes
Round:  8, Time elapsed: 1.646968021s, Memory before drop: 278655272 bytes, after drop: 278655272 bytes
Round:  9, Time elapsed: 1.849975862s, Memory before drop: 306839160 bytes, after drop: 306839160 bytes
Round: 10, Time elapsed: 2.061326166s, Memory before drop: 333552168 bytes, after drop: 333552168 bytes
wvwwvwwv commented 1 month ago

TreeIndex lazily deallocates memory in order to support lock-free reads. Calling ebr::Guard::new().accelerate() a number of times will help speed up garbage collection.
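
For reference, a minimal sketch of how that suggestion could be folded into the reproduction above; the loop count of 64 is an arbitrary choice for illustration, not a value recommended by the crate:

// Sketch: inside the per-round loop of the reproduction, after the map
// has been dropped, nudge the EBR garbage collector a few times.
drop(map);
for _ in 0..64 {
    // The iteration count is arbitrary; each guard gives the collector
    // another chance to reclaim retired nodes.
    scc::ebr::Guard::new().accelerate();
}
let memory_after_drop = get_memory_usage();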

However, I find TreeIndex::{clear, drop} a bit problematic: they don't remove individual entries; instead, they pass the entire map to the garbage collector. The garbage collector then reclaims the memory of each node, but a node may hold a strong reference to another node; when the first node's memory is reclaimed, the linked node is passed to the garbage collector and dropped later. This linked-list structure slows down garbage collection. I'll try to optimise this part.
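
To illustrate why such chaining is slow, here is a standalone simulation (purely conceptual, not scc's actual internals): if reclaiming a node only queues its successor for the next collection pass, freeing an N-node chain takes on the order of N passes rather than one.

// Standalone simulation, not scc code: deferred reclamation of a chain.
struct Node {
    next: Option<Box<Node>>,
}

fn main() {
    // Build a chain of 1_000 linked nodes.
    let mut head = None;
    for _ in 0..1_000 {
        head = Some(Box::new(Node { next: head }));
    }

    // Each pass drains the current queue; dropping a node pushes its
    // successor onto the queue for the *next* pass.
    let mut queue: Vec<Box<Node>> = vec![head.unwrap()];
    let mut passes = 0;
    while !queue.is_empty() {
        passes += 1;
        for mut node in std::mem::take(&mut queue) {
            if let Some(next) = node.next.take() {
                queue.push(next); // deferred until the next pass
            }
            // `node` itself is freed here.
        }
    }
    println!("collection passes needed: {passes}"); // ~1_000 for a 1_000-node chain
}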

changgyoopark-db commented 1 month ago

Garbage collection after clear/drop is now roughly five orders of magnitude faster. The new code still needs further testing; 2.1.17 will be released in 10 days.

changgyoopark-db commented 1 month ago

-> Just uploaded 2.1.17.