request: support weak references

rune-rs / rune

An embeddable dynamic programming language for Rust.

https://rune-rs.github.io

Apache License 2.0


request: support weak references #162

Open LeshaInc opened 4 years ago

LeshaInc commented 4 years ago

This code will slowly eat all available memory:

loop {
    let a = #{};
    a.a = a;  // if you comment out this line, everything will be fine
}

It happens every time an object contains a reference to itself.

The memory leak can also be reproduced in Rust without using any script objects:

pub struct Leaky {
    leak: Option<Shared<Leaky>>, // Shared can be replaced with Rc<RefCell<T>>
}

impl std::ops::Drop for Leaky {
    fn drop(&mut self) {
        println!("didn't leak");
    }
}

fn main() {
    let foo = Shared::new(Leaky { leak: None });
    let bar = foo.clone();
    foo.borrow_mut().unwrap().leak = Some(bar);
    // Nothing is printed, unless you comment out the line above
}

I see a few solutions:

Disallow such cycles entirely
Add weak references
Run a tracing garbage collector infrequently to find cycles (hard to integrate with Rust though)
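To illustrate the weak-reference option, here is a minimal sketch that uses only the standard library's `Rc` and `Weak` rather than Rune's `Shared<T>`. The `StrongNode` and `WeakNode` types and both function names are hypothetical and exist only for this comparison; how a weak handle would be exposed in Rune itself is not settled in this thread.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical type: holds a strong reference to itself, like `a.a = a`.
struct StrongNode {
    this: RefCell<Option<Rc<StrongNode>>>,
}

// Hypothetical type: holds only a weak reference to itself.
struct WeakNode {
    this: RefCell<Weak<WeakNode>>,
}

/// Returns true if the node is still alive after its last external handle is dropped.
fn strong_self_reference_leaks() -> bool {
    let node = Rc::new(StrongNode { this: RefCell::new(None) });
    *node.this.borrow_mut() = Some(node.clone());
    let observer = Rc::downgrade(&node);
    drop(node);
    // The inner strong reference keeps the count at 1, so the value is never freed.
    observer.upgrade().is_some()
}

/// Same shape, but the self-reference is weak, so it does not keep the node alive.
fn weak_self_reference_leaks() -> bool {
    let node = Rc::new(WeakNode { this: RefCell::new(Weak::new()) });
    *node.this.borrow_mut() = Rc::downgrade(&node);
    let observer = Rc::downgrade(&node);
    drop(node);
    // The strong count reached zero, so the value was dropped.
    observer.upgrade().is_some()
}

fn main() {
    println!("strong cycle leaks: {}", strong_self_reference_leaks());
    println!("weak cycle leaks:   {}", weak_self_reference_leaks());
}
```

The trade-off is that every access through the weak handle has to go through `upgrade()` and handle the case where the target is already gone.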

udoprog commented 4 years ago

Hey. Thanks for outlining this! It is indeed expected behavior right now and I'll queue it up to be documented in the book.

I'm not sure how to dynamically disallow such cycles. But I know some folks are interested in tinkering with integrating a gc, which to me would be the proper solution to dealing with cycles.

vi commented 4 years ago

Maybe weak references should be added in any case, even if cycles were not a problem?

udoprog commented 3 years ago

@vi I'm not against adding weak references, although I personally don't use them very much. If anyone else wants to give it a stab, feel free!

udoprog commented 3 years ago

Note that most of the necessary plumbing to prevent the shared block from being de-allocated is already in place. All you'd really need to do for an initial implementation is try and take the interior value once the strong ref count reaches zero.

This has to be done in a way that doesn't change the size of the Shared<T> container, though, which might be a little tricky since we need to distinguish between strong and weak references. Smuggling a marker into an aligned pointer could be useful here: assert that the pointer to SharedBox<T> is aligned when it is created, and mark weak references by setting the least significant bit.
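The pointer-marker idea could look roughly like the sketch below. This is not Rune's actual implementation: the `SharedBox<T>` layout shown here, the `Tagged<T>` wrapper, and `WEAK_BIT` are all assumptions made for illustration. It only shows that a handle can stay one pointer wide while still recording whether it is strong or weak.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Assumed stand-in for Rune's SharedBox<T>; the real field layout may differ.
struct SharedBox<T> {
    strong: usize,
    value: T,
}

// Hypothetical marker: the least significant bit flags a weak reference.
const WEAK_BIT: usize = 1;

// Hypothetical pointer-sized handle that is either strong or weak.
struct Tagged<T> {
    raw: usize,
    _marker: PhantomData<*mut SharedBox<T>>,
}

impl<T> Tagged<T> {
    fn new(ptr: *mut SharedBox<T>, weak: bool) -> Self {
        let addr = ptr as usize;
        // The low bit is only free if the allocation is at least 2-byte aligned.
        assert_eq!(addr & WEAK_BIT, 0, "SharedBox pointer must be aligned");
        let raw = if weak { addr | WEAK_BIT } else { addr };
        Self { raw, _marker: PhantomData }
    }

    fn is_weak(&self) -> bool {
        self.raw & WEAK_BIT != 0
    }

    // Clears the marker bit to recover the real pointer.
    fn ptr(&self) -> *mut SharedBox<T> {
        (self.raw & !WEAK_BIT) as *mut SharedBox<T>
    }
}

/// Returns (strong.is_weak, weak.is_weak, pointers round-trip, value read via weak handle).
fn demo() -> (bool, bool, bool, u32) {
    let ptr = Box::into_raw(Box::new(SharedBox { strong: 1, value: 42u32 }));
    let strong = Tagged::new(ptr, false);
    let weak = Tagged::new(ptr, true);
    let same = strong.ptr() == ptr && weak.ptr() == ptr;
    // SAFETY: `ptr` came from Box::into_raw above and has not been freed yet.
    let value = unsafe {
        assert_eq!((*weak.ptr()).strong, 1);
        (*weak.ptr()).value
    };
    // SAFETY: reclaim the allocation exactly once.
    unsafe { drop(Box::from_raw(strong.ptr())) };
    (strong.is_weak(), weak.is_weak(), same, value)
}

fn main() {
    // The tagged handle is exactly one machine word wide.
    assert_eq!(size_of::<Tagged<u32>>(), size_of::<usize>());
    println!("{:?}", demo());
}
```

A real implementation would also need to consult the marker in `Clone` and `Drop` to decide which of the two counters to adjust.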

udoprog commented 3 years ago

Started fiddling on the plumbing for a gc in the gc branch.

dranikpg commented 2 years ago

This is a very interesting issue for scripting languages, but it seems to be practically unsolvable, given that even the most popular runtimes like CPython would fail on this (if you add at least one more layer, like a list).

"Just" tracing doesn't help because you don't know when to run it. Running it too often would destroy performance. Running it only when memory is low would introduce gigantic latencies and would make your memory usage go in waves from ~0 to 100%.

This leaves you with either heap generations to cope with "short lived" objects (like the JVM does) or escape analysis (like golang)... and a fully fledged GC. As long as every non-trivial value sits behind a Shared<>, escape analysis (at least to a very shallow depth) is possible, given that you know how to track the value afterwards 🤨

The problem is that small but entangled memory leaks can be very hard to detect and trace. Big ones (like allocating in a loop) are easy to detect and solve with just a single manual drop.

erlend-sh commented 1 year ago

This might be a viable option: https://github.com/kyren/gc-arena