Manishearth / rust-gc

Simple tracing (mark and sweep) garbage collector for Rust
Mozilla Public License 2.0
962 stars 50 forks source link

Add a weak type Gc pointer #148

Open nekevss opened 1 year ago

nekevss commented 1 year ago

After reading #65, I thought I'd try and implement a weak reference type Gc. Thought I would submit this as a draft PR for feedback to see what changes may need to be made regarding what's been completed.

andersk commented 1 year ago

Do we want to generalize weak references to ephemerons? Ephemerons can be used to implement weak maps, while plain weak references cannot.

Manishearth commented 1 year ago

Hmm. It's worth considering this from a perf POV as well. I wonder if there's a way to do this without imposing additional costs on GC users that don't need Weak.

And yeah, proper ephemeron support would be nice.

nekevss commented 1 year ago

One of the methods, which is mentioned in a couple places, may be to flag the box if a weak ref is present, ignore it on trace and add it to a queue, and then go through queue and run trace on the nodes flagged as weak to determine if it's target has a strong root. I had thought about that approach, but the only way I could think to do that way to update the Trace trait to contain something like:

pub fn weak_ref_trace(&self) -> bool;

I ended up going with the above because changing up Trace seemed little extreme at the time. Although, maybe that is the better option, especially given performance for just Gc users.

nekevss commented 1 year ago

@andersk and @Manishearth, I made a some changes to align a bit more with ephemorons as mentioned above. Is this more along the lines of what you were looking for? I opted for implementing a weak_nodes queue and weak_trace method onto Trace.

I should note that the changes made do differ a bit from the paper Racket cites regarding ephemerons by not adding the ephemerons found in weak_trace to the end of the weak_nodes queue in favor of just continuing the weak_trace, but that could be the wrong choice.

Manishearth commented 1 year ago

I'll let @andersk have a look first but yeah this does look closer? I think.

andersk commented 1 year ago

These still aren’t ephemerons. Ephemerons are pairs Ephemeron<K, V>, where a weak reference would then be Ephemeron<K, ()> (for which we might provide WeakGc<K> as a type alias or thin wrapper). Are you saying this implementation would be more easily extensible in that way? What would be involved? Any reason to wait?

TBH I’m not sure what it would mean to evaluate the correctness of this implementation, given that we currently have known correctness bugs:

so maybe we should think about fixing those first.

nekevss commented 1 year ago

Is that first issue still occurring for you regarding resurrecting dead objects (#124)? I'm not able to replicate the behavior (the object seems to print out the value for me).

andersk commented 1 year ago

Yes, it still reproduces on master at 6c5f754682e573e4590b52168dce835bba28beaf and on this PR at c8838e95eb770a626ca05a90b9bcede36107f4b2. There’s no reason it wouldn’t, as the diagnosis I gave in the issue still holds: there’s no way the is_marked() check in sweep() can ever return true, because every node has been unmarked by the end of mark().

I’ve opened #149 with my reproduction in the form of a failing test case.

nekevss commented 1 year ago

I took the test from the submitted PR and added it to this commit (hope you don't mind! Can definitely remove it from the PR if you'd prefer😄). The test you provided passes on my computer for both this current branch and my local master branch 6c5f75.

image

Is it possible for this to be hardware related?

Also, I still have to do some testing, but is this looking more along the lines of what you were thinking of? I added an Ephemeron implementation with a WeakGc and WeakPair wrapper. There's also pretty extensive changes to the mark and sweep setup to make it "Ephemeron aware", so to speak.