rust-analyzer is a long-lived process which oversees many concurrent activities.
The code for this lives mainly in main_loop.rs. Luckily, the amount of
concurrency we need is small, so the thing is still manageable. Nonetheless, it
is hard to understand, and, being the heart of rust-analyzer, it would be cool
to simplify it.
The list of concurrent activities we need to handle:

- LSP requests and notifications
- background request processing
- cargo check / cargo metadata project loading
- cargo check diagnostics
- vfs
Cross-cutting concerns:

- cancellation
  - during shutdown
  - when switching Cargo.toml, it makes sense to kill cargo check. This is the
    bit which doesn't map nicely to trio's scopes.
- eventually consistent non-concurrent updates: for cargo check / cargo metadata
  we want to run at most one process at a time. If there's a current process, we
  should wait for it to finish and schedule a new one.
- selection: in vfs, the main loop, and flycheck we need the ability to select
  among n concurrent activities.
Possible approaches:
1 Actorish model
This is mostly what we do today. The main thread owns all the state and has a
mailbox. Concurrent activities happen on background threads and communicate with
the main thread asynchronously and unidirectionally, by posting messages.
Project loading looks like this:

```rust
fn load_project() {
    spawn(|| {
        send(Begin);
        for _ in work {
            send(Progress);
        }
        send(End(loaded_project));
    })
}

fn main_loop() {
    let mut state = ...;
    loop {
        match next_event() {
            Begin => start_progress(&mut state),
            Progress => report_progress(&mut state),
            End(project) => {
                end_progress(&mut state);
                switch_to(&mut state, project);
            }
        }
    }
}
```
The benefit is that it's crystal clear what actually happens, as one can
dbg!(next_event) and get a linear history of all events as data.
The drawback is the tearing of control flow -- to understand a project reload
(which conceptually is a single continuous activity), one needs to jump between
load_project and main_loop.
2 Higher-order actorish model
Rather than sending n different message types, we can have a single message type which packs a closure to be applied to the state:
```rust
fn load_project() {
    spawn(|| {
        send(|state| start_progress(state));
        for _ in work {
            send(|state| report_progress(state));
        }
        send(|state| {
            end_progress(state);
            switch_to(state, loaded_project);
        });
    })
}

fn main_loop() {
    let mut state = ...;
    loop {
        next_event()(&mut state);
    }
}
```
The benefit here is somewhat more linear control flow (though everything still happens asynchronously) and extensibility with new "message types".
Also, because the set of functions is open, the set of actions supported by an Agent (in the sense of Clojure's agents: https://clojure.org/reference/agents) is also open, a sharp contrast to the pattern-matching message handling loops provided by some other languages.
The drawback is that the main loop is no longer as obvious -- instead of a stream of events as data, you get a stream of closures.
Because there's no single place where state modification happens, it's harder to spot interference between distinct concurrent activities.
3 Mutexes
The third approach is to wrap the state in a mutex. The code looks exactly as
in the last example, with the difference that it actually is synchronous
(background activity resumes after the critical section is finished, and not
merely scheduled).
The main benefit here is simplicity -- each activity is fully straight-line code.
The main drawback is that we don't have a central place where all mutation happens.
Misc
It seems the following abstraction of a concurrent process might be helpful:
```rust
let task: Task<i32> = Task::spawn(|stop_token| {
    ...
    if stop_token.stop_requested() {
        return -1;
    }
    ...
    92
});

// Two-phase cancellation: we first request a stop, then we wait for the task
// to actually finish.
let task: StoppedTask = task.request_stop();
drop(task);
```
Alas, to be useful we need a way to plug this (a vector of these, actually) into
select, and it's unclear how to do that.
Zulip discussion thread: https://rust-lang.zulipchat.com/#narrow/stream/185405-t-compiler.2Fwg-rls-2.2E0/topic/Concurrency.20in.20RA