quark-zju / gcmodule

Garbage collection for Rust inspired by CPython's gcmodule
MIT License
20 stars 5 forks source link

don't track locked lock for the same reason as RefCell #4

Closed mio-19 closed 4 years ago

quark-zju commented 4 years ago

Thanks for the pull request. Multi-thread is hard and this makes me think about it more and discovered some issues (ex. ba9a12f6aeba18503687ed097e4cdd76f5173778).

Regarding on this PR, I think it's better to just panic immediately if the lock cannot be obtained, instead of blocking (the existing code) or ignoring (the PR). This is because the current code expects all Mutex to be released when the collector is running (or the collector does not run until they are released). If not, then it's a logic error that is ideally surfaced early so it gets noticed.

Right now, the multi-thread collector needs to "stop the world" to be safe and correct. More precisely, it needs to "stop other threads from modifying the object list and object dependency graph" while it is collecting things.

For the object list, the collector takes control of object creation and deletion so it can lock them accordingly. For the dependency graph, it becomes tricky. Because any T could have some internal mutability that changes what trace visits, and that'd be a nightmare for the collector.

To address the issue, the current solution is to prevent access to T while the collector is running. This is done by the ThreadedCcRef type. ThreadedCcRef is expected to be the only way to access a T stored in ThreadedCc<T> and ThreadedCcRef takes a lock so collector does not run. This is a stronger guarantee than just preventing modifications to the object dependency grpah. However it seems to work well and the Rust borrow checker helps avoiding issues.

Let's go back to Mutex. The only way for a thread to take a mutex lock is to get a MutexGuard object borrowed from a Mutex. The only way to get a Mutex in a ThreadedCc<Mutex> is to get a ThreadedCcRef first. So to lock there needs to be a borrow chain: ThreadedCc -> ThreadedCcRef -> Mutex -> MutexGuard. If ThreadedCcRef goes away, then the borrow checker ensures MutexGuard goes away too. And we already know that ThreadedCcRef and the collector are exclusive. So when the collector is running, there are no ThreadedCcRef and no MutexGuards.

There are some special cases, though. For example, ThreadedCc<Arc<Mutex<T>> allows mutating the T without going through ThreadedCcRef. How do we deal with that? Well, Arc is considered acyclic. Therefore from the collector's point of view, even if T is mutated by other threads, the dependency graph does not change.

Now, if you get this far and the above makes sense for you, could you update the code and comments about why we're doing this?