Open matklad opened 6 years ago
It might also be interesting to compare with IntelliJ approach.
Unlike RLS, IntelliJ owns the text files, so it's internal data structures are mutable. However, there's only a single RW lock for all IntelliJ data, all request processing happens within a read lock (which guarantees you a consistent snapshot) unless it is a write request, in which case a write lock is held in the same way for the duration of the whole request (additionally, only a dedicated thread can hold a write lock).
FWIW, I am experimenting with "mutable state + readonly snapshots" approach in libsyntax2, and everything looks good so far. All the state data is mutably owned by the main loop, handlers receive consistent read only snapshots and are dispatched to the threadpool, and that all works without a single lock.
EDIT: to clarify, libsyntax2 has grown its own language server.
State & Snapshot approach sounds reasonable for me, and I'm really interested in improving concurrency. But, if we changed our concurrent architecture, how could we test and check that it's surely improved? @matklad Any idea?
I think the only way to test architecture is by trial and error :) In this particular case, the specific improvement I am looking for is the repeatable read invariant, and it's pretty clear that current approach with hidden mutexes does not yield one naturally.
There's a question of "do we really need strong semantics here in the first place?", and I'd like to argue that it indeed would be beneficial for long-term stability and maintainability.
I am looking at this function and I am having a hard time convincing myself it has any particular strong semantics:
https://github.com/rust-lang-nursery/rls/blob/fbec2de58e727b3b614001650d8d80a8d2b2f85b/src/build/mod.rs#L305-L316
My understanding is that
block_on_build
should guarantee that we have an up-to-date analysis so that write operations likerename
work correctly. I think thatblock_on_build
fails to achieve this goal:The loop spins on
building
atomic, which is reset inrequest_build
. However, the actual analysis results are prepared later, in analysis queue. So there's a window of opportunity, when the build per se is finished, but analysis results are staleLooks like analysis job might be cancelled which would unpark the waiting threads and cause them to read stale data.
Am I correct that we might after all, in theory, get a not so fresh analysis after
block_on_build
?More generally though, I feel the core of the problem here is that
AnalysisHost
is a mutable object which holds acurrent
version of analysis, wherecurrent
is not well-defined. In particular, the data stored in analysis can change in the middle of some request, and you can observe a violation of repeatable read.I wonder if we can improve the situation in a principled way? One approach which might work is to split all RLS data into mutable
State
part and immutableSnapshot
part. One part of the state is the current snapshot. All RLS operations then would at the begning read the current snapshot from the state, process it (potentially in parallel) and then respond with a result, unless the current snaphot have changed (i.e, we store an autoincrement version in the snapshot). RLS notifications will schedule the work for updating the mutable state in a serialized manner. That is, I want to guarantee that operations internally see a consistent snapshot of the world, and that all advances of the current state are centralized and serialized. In this world, the api forblock_for_build
might look like this:Does this sounds reasonable? These all is pretty hand-wavy as written but if we want to pursue a Clojuresque "mutable state singleton + readonly snapshots + versions to check for stale snapshots" approach I can try to sketch some code :)