elsirion commented 9 months ago

Edit: TODO checklist:

[x] main refactoring change unblocking other improvements
[x] make it go brrrr (stream history)
[x] make it reuse history (downloaded epochs) so multiple modules don't duplicate work
[x] do something about weird add_state_machines_inactive_dbtx

Previous discussion

In the design described below, a badly implemented module could stall the recovery of all other modules, so we decided against it. I leave it here for historical context. For the actual proposal see https://github.com/fedimint/fedimint/issues/2977#issuecomment-1819626967

OUTDATED:

The current recovery interface was built in a hurry to bring the new client library to feature parity with the old one so that a switch could happen. Unfortunately this led to a design with multiple shortcomings, mostly stemming from the fact that recovery is modeled as a module state machine and the different modules cannot easily interact.

Speed: The state machine executor is not built for heavy loads and computation cannot be easily parallelized. https://github.com/fedimint/fedimint/issues/2784
Storage Requirements: to mitigate loss of funds and allow for recovery in case of client bugs the state machine executor keeps a log of all intermediate states of a state machine. The large recovery state bloats that log and would need special handling, making the executor more complex.
No Shared Epoch Download: All modules need to scan the epoch history since the last backup for items belonging to the module/user. Currently each module would need to download the entire history itself to look for its items, resulting in a lot of unnecessary traffic.
Unclear Client State: While the recovery process is running the Client object already exists, but it is a programmer error to call any module methods that could trigger DB writes and thus interfere with recovery. Instead a recovering client should be a separate type.

Proposed Architecture

I already talked to @dpc about the following new architecture:

The build_restoring_from_backup returns a RecoveringClient struct
- That struct has very limited external functionality:
- RecoveringClient::subscribe_progress() -> watch::Receiver<Progress>: A way to track recovery progress
- impl Future<Output=Result<Client>>: the RecoveringClient is a future that can be awaited and returns the client on successful completion of recovery
- Internally, it consists of:
- An epoch fetcher task that downloads the epoch history and dispatches consensus items and transaction inputs/outputs to the respective module task. The remaining epochs and average processing speed are also used to update the progress watch::Sender.
- One recovery task per module, these receive consensus items and transaction inputs/outputs via a channel and process them. Most will be skipped for not belonging to the user. The channel can smooth out item processing latencies while still providing backpressure.
  - Every n processed epochs the task yields an intermediate state that gets persisted to disk. If the RecoveringClient is dropped before completion it will continue from there on next try.
  - Once finished the module recovery task returns module state machines to be added to the executor and an initialized client module struct.
- Maybe if needed: A thread pool that is given to all module tasks and can be used to run computation intensive tasks that would be ill-suited for the async runtime.
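
The fetcher/worker split described above can be sketched roughly as follows. This is only an illustration under loud assumptions: it uses std threads and a bounded std channel in place of async tasks, and `Item`, `run_recovery` and the channel capacity of 16 are hypothetical names and values, not fedimint APIs.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{sync_channel, SyncSender};
use std::thread;

/// Hypothetical stand-in for a consensus item or transaction input/output.
#[derive(Debug, Clone)]
pub struct Item {
    pub module_id: u16,
    pub epoch: u64,
    pub payload: Vec<u8>,
}

/// Spawns one recovery worker per module, then walks the shared history once
/// and dispatches each item to the worker of its module.
/// Returns, per module, how many items that worker received.
pub fn run_recovery(history: Vec<Item>, module_ids: &[u16]) -> HashMap<u16, usize> {
    let mut senders: HashMap<u16, SyncSender<Item>> = HashMap::new();
    let mut workers = Vec::new();

    for &id in module_ids {
        // Bounded channel: `send` blocks once 16 items are queued, which is
        // the backpressure mentioned in the text.
        let (tx, rx) = sync_channel::<Item>(16);
        senders.insert(id, tx);
        let handle = thread::spawn(move || {
            let mut processed = 0usize;
            for _item in rx {
                // A real module would check here whether the item belongs to the user.
                processed += 1;
            }
            processed
        });
        workers.push((id, handle));
    }

    // The "epoch fetcher": the history is read once, not once per module.
    for item in history {
        if let Some(tx) = senders.get(&item.module_id) {
            tx.send(item).expect("worker hung up");
        }
    }
    // Dropping the senders closes the channels so the workers can finish.
    drop(senders);

    workers
        .into_iter()
        .map(|(id, handle)| (id, handle.join().expect("worker panicked")))
        .collect()
}
```

Items for modules without a worker are simply skipped in this sketch; progress reporting and checkpointing are left out.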

This design solves the problems mentioned above:

Speed: All modules run recovery in parallel by default, optionally we can add a thread pool.
Storage Requirements: Recovery is not critical, since by definition it can always be restarted from the seed, so we don't need to permanently log intermediate steps.
No Shared Epoch Download: All modules can use the same epoch download now.
Unclear Client State: Until the RecoveringClient future resolves, no client functions are available.

justinmoon commented 9 months ago

What happens if client is restarted during recovery, and recovery is attempted again? Would you start all over again or just resume from where you left off?
subscribe_progress sounds very useful.

elsirion commented 9 months ago

What happens if client is restarted during recovery

Good that you bring that up, maybe the default build method would return some enum:

enum ClientOrRecovering {
    Initialized(Client),
    Recovering(RecoveringClient),
}

while the recovering build fn always returns a RecoveringClient?

EDIT: and yes, you'd resume from the last checkpoint.
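
A caller-side sketch of how such an enum might be consumed. `Client` and `RecoveringClient` are empty stand-ins here, and `build`, `describe` and the `last_checkpoint_epoch` field are hypothetical names made up for illustration.

```rust
/// Hypothetical stand-ins; the real types would carry far more state.
pub struct Client;

pub struct RecoveringClient {
    /// Last epoch covered by a persisted checkpoint (assumed field).
    pub last_checkpoint_epoch: u64,
}

pub enum ClientOrRecovering {
    Initialized(Client),
    Recovering(RecoveringClient),
}

/// Hypothetical default build method: if a recovery checkpoint is still
/// present, hand back a recovering client that resumes from it.
pub fn build(checkpoint: Option<u64>) -> ClientOrRecovering {
    match checkpoint {
        Some(epoch) => ClientOrRecovering::Recovering(RecoveringClient {
            last_checkpoint_epoch: epoch,
        }),
        None => ClientOrRecovering::Initialized(Client),
    }
}

/// The caller has to handle both cases explicitly before it gets a `Client`.
pub fn describe(client: &ClientOrRecovering) -> String {
    match client {
        ClientOrRecovering::Initialized(_) => "ready".to_string(),
        ClientOrRecovering::Recovering(r) => {
            format!("resuming from epoch {}", r.last_checkpoint_epoch)
        }
    }
}
```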

justinmoon commented 6 months ago

Dev call: this description is a little outdated

dpc commented 5 months ago

New design:

handle to fetching epochs operation (probably on existing ClientContext)
slap LRU caching fetched epochs on top, so modules fetching the same stuff don't make queries for the same epochs over and over.
recovery needs to be async so a slow to recover module doesn't block the whole client
maybe some other stuff we decided on that I forgot now (@elsirion do you remember anything else?)
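
A minimal sketch of the LRU idea from the list above, assuming epochs are identified by a `u64` and can be cloned cheaply. `EpochCache`, `get_or_fetch` and the `Epoch` alias are hypothetical; a real implementation would probably use an existing LRU crate and an async fetch function.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical stand-in for a downloaded epoch.
pub type Epoch = Vec<u8>;

pub struct EpochCache {
    capacity: usize,
    entries: HashMap<u64, Epoch>,
    /// Epoch ids ordered from least to most recently used.
    order: VecDeque<u64>,
    /// How often the fetch function was actually called (cache misses).
    pub fetches: usize,
}

impl EpochCache {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0);
        EpochCache {
            capacity,
            entries: HashMap::new(),
            order: VecDeque::new(),
            fetches: 0,
        }
    }

    /// Returns the cached epoch, or calls `fetch` once and caches the result,
    /// evicting the least recently used entry if the cache is full.
    pub fn get_or_fetch(&mut self, id: u64, fetch: impl FnOnce(u64) -> Epoch) -> Epoch {
        if let Some(epoch) = self.entries.get(&id) {
            let epoch = epoch.clone();
            self.touch(id);
            return epoch;
        }
        self.fetches += 1;
        let epoch = fetch(id);
        if self.entries.len() == self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.entries.remove(&oldest);
            }
        }
        self.entries.insert(id, epoch.clone());
        self.order.push_back(id);
        epoch
    }

    /// Marks `id` as most recently used.
    fn touch(&mut self, id: u64) {
        self.order.retain(|&e| e != id);
        self.order.push_back(id);
    }
}
```

With this in place, two modules asking for the same epoch trigger only one download as long as the epoch has not been evicted in between.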

elsirion commented 5 months ago

Some more ideas:

Keep using state machines but add a property (in form of a member fn of IState/State) that says if the execution log should be persisted. That way we don't spam the client DB anymore.
Separate init fn on ClientModuleInit trait that returns client module and recovery SM to be spawned (idk if there's a more elegant way now)
Modules should expose a way of tracking recovery progress, most importantly if it finished
- Any client module becomes usable once both the primary module and the module itself have finished recovery.
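
The first idea, a per-state property that controls log persistence, could look roughly like this. The `State` trait below is a heavily reduced stand-in, and `persist_log`, `record_transition` and the two example states are hypothetical names, not the existing IState/State API.

```rust
/// Hypothetical, heavily reduced stand-in for the client's state trait.
pub trait State {
    fn name(&self) -> String;

    /// Whether transitions of this state machine should be written to the
    /// execution log. Defaults to `true` so existing state machines keep
    /// their current behavior.
    fn persist_log(&self) -> bool {
        true
    }
}

pub struct PaymentState;

pub struct RecoveryState {
    pub epochs_done: u64,
}

impl State for PaymentState {
    fn name(&self) -> String {
        "payment".to_string()
    }
}

impl State for RecoveryState {
    fn name(&self) -> String {
        format!("recovery@{}", self.epochs_done)
    }

    // Recovery can always be restarted, so its large intermediate states
    // opt out of the log.
    fn persist_log(&self) -> bool {
        false
    }
}

/// Executor-side hook: append to the log only when the state asks for it.
pub fn record_transition(log: &mut Vec<String>, state: &dyn State) {
    if state.persist_log() {
        log.push(state.name());
    }
}
```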

(Old notes: https://github.com/fedimint/fedimint/pull/3008#issuecomment-1791188772)

dpc commented 5 months ago

Separate init fn on ClientModuleInit trait that returns client module and recovery SM to be spawned (idk if there's a more elegant way now)

We can just pass snapshot: Option<&[u8]> to init I guess...

dpc commented 5 months ago

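I'm tempted to have async fn recover(&self, args: &ClientModuleInitArgs<Self>) -> anyhow::Result<Self::Recovery> and a whole new trait ClientModuleRecovery with something like:

// `Sized` is required so that `Option<Self>` can be returned by value.
trait ClientModuleRecovery: Sized {
    // Runs one step; returns `None` once recovery is complete.
    async fn make_progress(self) -> Option<Self>;
    // Fraction of the recovery that is done so far.
    fn progress(&self) -> f32;
}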

and track them separately. Otherwise it will just pollute the normal client with an extra global mode everywhere.

@elsirion

elsirion commented 5 months ago

We can just pass snapshot: Option<&[u8]> to init I guess...

How would we distinguish "recover without backup" and "no recovery"?

elsirion commented 5 months ago

By having a make_progress fn, do you want to implement an entirely new mechanism of driving recovery forward? As I understood it we'd let the SM executor do most of the heavy lifting and just add methods to track progress.

dpc commented 5 months ago

By having a make_progress fn, do you want to implement an entirely new mechanism of driving recovery forward? As I understood it we'd let the SM executor do most of the heavy lifting and just add methods to track progress.

I'm thinking: Recoveries don't have conditionals and diverging states, so they don't need fancy state machines. We keep pulling make_progress and safekeep the progress in the db until None is returned and recovery is complete. After that the module can be ClientModuleInit::inited.
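
The "keep pulling make_progress" loop might be sketched like this. To stay runnable without an async runtime, the trait below is a synchronous stand-in for the one proposed above; `drive`, the `persist` callback and the `ScanEpochs` example type are hypothetical.

```rust
/// Synchronous stand-in for the trait sketched in the thread.
pub trait ClientModuleRecovery: Sized {
    /// Runs one step; returns `None` once recovery is complete.
    fn make_progress(self) -> Option<Self>;
    /// Fraction of the recovery that is done so far.
    fn progress(&self) -> f32;
}

/// Pulls `make_progress` until it returns `None`, handing every intermediate
/// state to `persist` first (standing in for a DB write).
/// Returns the number of steps that were executed.
pub fn drive<R: ClientModuleRecovery>(start: R, mut persist: impl FnMut(&R)) -> usize {
    let mut steps = 0;
    let mut current = Some(start);
    while let Some(recovery) = current {
        persist(&recovery);
        current = recovery.make_progress();
        steps += 1;
    }
    steps
}

/// Hypothetical example recovery: scan epochs `next..end` one at a time.
pub struct ScanEpochs {
    pub next: u64,
    pub end: u64,
}

impl ClientModuleRecovery for ScanEpochs {
    fn make_progress(self) -> Option<Self> {
        let next = self.next + 1;
        if next < self.end {
            Some(ScanEpochs { next, end: self.end })
        } else {
            None
        }
    }

    fn progress(&self) -> f32 {
        self.next as f32 / self.end as f32
    }
}
```

Because there is a single linear sequence of states, a plain loop is enough here; no state machine executor is involved.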

elsirion commented 5 months ago

But we don't want a separate high-level struct, i.e. RecoveringClient, right (that's what I first intended, but you had good arguments against that)? So Client would have a map of initialized modules and another with recovering ones?

dpc commented 5 months ago

So Client would have a map of initialized modules and another with recovering ones?

Yeah.

A recovering client module is really not a client module. It can't do anything, so all methods would require a conditional returning "I'm not ready" etc. @elsirion
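
A rough sketch of the two-map layout agreed on above. All names (`ModuleInstance`, `RecoveringModule`, `ModuleLookupError`, the methods) are hypothetical; the point is only that the "not ready" check lives in one lookup instead of in every module method.

```rust
use std::collections::HashMap;

pub type ModuleId = u16;

/// Hypothetical stand-in for a fully initialized client module.
pub struct ModuleInstance {
    pub name: String,
}

/// Hypothetical stand-in for a module that is still recovering.
pub struct RecoveringModule {
    pub progress: f32,
}

#[derive(Debug, PartialEq)]
pub enum ModuleLookupError {
    StillRecovering,
    Unknown,
}

#[derive(Default)]
pub struct Client {
    modules: HashMap<ModuleId, ModuleInstance>,
    recovering: HashMap<ModuleId, RecoveringModule>,
}

impl Client {
    pub fn start_recovery(&mut self, id: ModuleId) {
        self.recovering.insert(id, RecoveringModule { progress: 0.0 });
    }

    /// Moves a module from the recovering map to the initialized map.
    pub fn finish_recovery(&mut self, id: ModuleId, module: ModuleInstance) {
        self.recovering.remove(&id);
        self.modules.insert(id, module);
    }

    /// Only initialized modules are handed out; recovering ones are reported
    /// as such without pretending to be a usable module.
    pub fn module(&self, id: ModuleId) -> Result<&ModuleInstance, ModuleLookupError> {
        if let Some(module) = self.modules.get(&id) {
            return Ok(module);
        }
        if self.recovering.contains_key(&id) {
            Err(ModuleLookupError::StillRecovering)
        } else {
            Err(ModuleLookupError::Unknown)
        }
    }
}
```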

dpc commented 4 months ago

Pasting from a post I made earlier:

I think I'm going to have the modules use their own database to preserve the recovery state as they please. They can use a single key to store the state and encode/decode it (like the existing mint recovery code does) if they want, but also store it in multiple keys or do whatever else they want. Passing encodable state back and forth is a drag, introducing extra type-system conversions etc. and the overhead of passing things back and forth. The original mint client recovery was written in the current style because it had to fit the state machines, but this is no longer a constraint. In certain situations it might not even be desirable/possible to fit all the recovery state in memory and pass it around.

Then I realized that having make_progress is also kind of limiting and pointless, and I'll just use tokio::sync::watch or something so that recovering modules can send updates whenever they feel like it and don't have to interrupt execution to return "progress". This makes things like pre-fetching, buffering, and using streams much easier, as the recovery code is just an unconstrained future running to completion.

So all in all, instead of ClientRecovery::make_progress, the global recovery code will be calling ClientModuleInit::recover with some RecoveryArgs on each start until that future completes without error (completed), which will be recorded in the persisted store so that on the next start ClientModuleInit::init can be called instead.
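
The start-up decision described here can be sketched as follows. This is a synchronous approximation under stated assumptions: a plain closure stands in for the recover future, a std mpsc channel stands in for tokio::sync::watch, and `Store`, `StartOutcome` and `start_module` are hypothetical names.

```rust
use std::collections::HashSet;
use std::sync::mpsc::Sender;

pub type ModuleId = u16;

/// Hypothetical persisted store: the set of modules whose recovery completed.
#[derive(Default)]
pub struct Store {
    recovered: HashSet<ModuleId>,
}

#[derive(Debug, PartialEq)]
pub enum StartOutcome {
    /// Recovery completed on an earlier start, so `init` can be called.
    Initialized,
    /// Recovery ran to completion on this start and was recorded.
    RecoveryFinished,
    /// Recovery failed; it will be attempted again on the next start.
    RecoveryFailed(String),
}

/// Called on every client start for every module.
pub fn start_module(
    store: &mut Store,
    id: ModuleId,
    recover: impl FnOnce(&Sender<f32>) -> Result<(), String>,
    progress: &Sender<f32>,
) -> StartOutcome {
    if store.recovered.contains(&id) {
        return StartOutcome::Initialized;
    }
    // The recovery code runs unconstrained and reports progress whenever it
    // likes through the channel.
    match recover(progress) {
        Ok(()) => {
            store.recovered.insert(id);
            StartOutcome::RecoveryFinished
        }
        Err(e) => StartOutcome::RecoveryFailed(e),
    }
}
```

Only the "completed" flag is owned by the global code in this sketch; any intermediate recovery state would live in the module's own database.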

and then @elsirion voicing concerns:

I feel like FM should at least provide a common way/framework to build resilient backup recovery. Maybe we give full control to the module authors to do it themselves, but supply a generic function that lets modules define a recovery state machine and runs it, persisting and announcing progress regularly.

to which I would say that structurally, module authors messing up their implementations is not a problem (it doesn't block or break anything else in their Client), but indeed I'd love to have helpers for that.

Currently I'm planning to give the fn ModuleInit::recovery some RecoveryArgs, very much like InitArgs. Other than the normal accessors, it can have one or more helpers of the form "given this single step logic, returning this Encodable state, make the recovery happen correctly".
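
One possible shape for such a helper, purely as a sketch: the module supplies the single-step logic and an encodable state, and the helper takes care of resuming and persisting. `Codec` is a made-up minimal stand-in for Encodable/Decodable, and `Step`, `run_steps`, the `slot` parameter and the `max_steps` budget (used here to simulate an interrupted run) are all hypothetical.

```rust
/// Hypothetical minimal stand-in for an encodable/decodable state.
pub trait Codec: Sized {
    fn encode(&self) -> Vec<u8>;
    fn decode(bytes: &[u8]) -> Option<Self>;
}

impl Codec for u64 {
    fn encode(&self) -> Vec<u8> {
        self.to_be_bytes().to_vec()
    }

    fn decode(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != 8 {
            return None;
        }
        let mut buf = [0u8; 8];
        buf.copy_from_slice(bytes);
        Some(u64::from_be_bytes(buf))
    }
}

pub enum Step<S> {
    Continue(S),
    Done,
}

/// Resumes from the state stored in `slot` (or from `initial`), runs `step`
/// up to `max_steps` times and persists the state after every step.
/// Returns `true` once the step logic reports `Done`.
pub fn run_steps<S: Codec>(
    slot: &mut Option<Vec<u8>>,
    initial: S,
    max_steps: usize,
    mut step: impl FnMut(S) -> Step<S>,
) -> bool {
    let mut state = slot.as_deref().and_then(S::decode).unwrap_or(initial);
    for _ in 0..max_steps {
        match step(state) {
            Step::Continue(next) => {
                *slot = Some(next.encode());
                state = next;
            }
            Step::Done => {
                // Nothing left to resume from.
                *slot = None;
                return true;
            }
        }
    }
    false
}
```

A module that prefers to manage its own database keys would simply not use the helper.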

justinmoon commented 3 months ago

🎉

fedimint / fedimint

Rearchitect modularized backup recovery #2977
