spacejam / sled


OOM while traversing large database. #1036

Open winstonewert opened 4 years ago

winstonewert commented 4 years ago

sled 0.31.0
rustc 1.42.0 (b8cedc004 2020-03-09)
Ubuntu 19.10

Code:

fn main() {
    dotenv::dotenv().ok();
    pretty_env_logger::init();

    let db = sled::Config::new()
        .path("/mount/large/server/sled")
        .use_compression(true)
        // cache capacity deliberately set far below the 1 GB default
        .cache_capacity(256)
        .open()
        .unwrap();

    let wikidata = db.open_tree("wikidata").unwrap();

    // just traversing the tree (len() scans every entry) is enough to OOM
    wikidata.len();
    wikidata.len();
}

Expected outcome: uses a small amount of memory.
Actual outcome: uses several gigabytes of memory and gets killed.

The database is 4.7 GB.

Some debug logs: https://pastebin.com/e49teW5m
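
For context, Tree::len is documented as performing a full scan under the hood, so the two len() calls above amount to iterating every key in the tree. Roughly this sketch (not the actual len implementation, just the equivalent traversal):

fn count_entries(wikidata: &sled::Tree) -> usize {
    let mut count = 0;
    // walk every key/value pair; each item pages tree nodes through the cache
    for kv in wikidata.iter() {
        let (_key, _value) = kv.unwrap();
        count += 1;
    }
    count
}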

tokahuke commented 4 years ago

This also causes an OOM kill (before the disk is exhausted) if left running long enough.

extern crate bincode;

fn main() -> sled::Result<()> {
    let db = sled::open("db")?;
    let t1 = db.open_tree("t1")?;
    let t2 = db.open_tree("t2")?;

    let mut bincode = bincode::config();
    bincode.big_endian();

    // This approximates somewhat my workload...
    let infinite_data = (0u64..).flat_map(|i| (0u64..100).map(move |j| (i, j)));

    for (i, j) in infinite_data {
        // big-endian encoding keeps tuples in lexicographic order:
        // t1 receives keys in strictly ascending order, while t2 receives
        // them scattered across 100 different prefixes
        let key1 = bincode.serialize(&(i, j)).expect("can serialize");
        let key2 = bincode.serialize(&(j, i)).expect("can serialize");

        t1.insert(key1, vec![])?;
        t2.insert(key2, vec![])?;
    }

    Ok(())
}

Using sled 0.31 (and also the latest master) on Ubuntu 18 and Rust 1.45 (nightly 2020-05-12). C'mon, man! I can try to live with sled using tons of space, but this is a deal breaker in my case (and basically any big-data use case). Hope it's easy to fix, though.

tokahuke commented 4 years ago

Yo, found a mitigation which might also help solve the problem... I found this comment here:

cache_capacity is currently a bit messed up as it uses the on-disk size of things instead of the larger in-memory representation. So, 1gb is not actually the default memory usage, it's the amount of disk space that items loaded in memory will take, which will result in a lot more space in memory being used, at least for smaller keys and values. So, play around with setting it to a much smaller value.

https://github.com/spacejam/sled/issues/986#issuecomment-592950100

Which led me to fiddle with the cache_capacity knob, setting it to 100_000 instead of the default 1_000_000_000. Qualitatively, that mitigated the problem for my workload.
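
Concretely, the change is just the builder knob (a sketch; the path is a placeholder, not my real setup):

fn open_small_cache() -> sled::Result<sled::Db> {
    sled::Config::new()
        .path("db") // placeholder path
        // budget ~100 KB of on-disk data in the cache instead of the 1 GB default;
        // per the comment quoted above, the in-memory footprint can still be larger
        .cache_capacity(100_000)
        .open()
}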

winstonewert commented 4 years ago

As you can see in my code, I already set the cache capacity about as low as it can go, and that did not resolve my problem. So I wonder if we are hitting different issues.

tokahuke commented 4 years ago

Good question... I was reluctant to open a new issue, though. For context, my problem is loading a big dataset, so it's a write problem, not a read problem. @spacejam has been in contact and told me it seems to be a known issue.

spacejam commented 4 years ago

I'm currently looking into this approach for handling this issue: https://github.com/spacejam/sled/issues/1093

winstonewert commented 4 years ago

@spacejam That ticket references many inserts, which wasn't my issue; mine is simply traversing a large database. They could be related for all I know, but I wanted to make sure.