lightningdevkit / rust-lightning

A highly modular Bitcoin Lightning library written in Rust. It's rust-lightning, not Rusty's Lightning!

Async KV Store Persister #1470

Open TheBlueMatt opened 2 years ago

TheBlueMatt commented 2 years ago

Sensei ran into an issue where ldk-net-tokio (or some other async context) calls into ChannelManager, which handles some message and then goes to update the ChannelMonitor. Because Sensei is using the new KVStorePersister interface to simplify its persistence, it ends up in a sync context on a tokio worker thread and then has to block on a future to talk to a (possibly remote) database. It then returns control to tokio by blocking on the future, but at that point we're still holding the channel and ChainMonitor locks, causing most other future execution to block.

While tokio should probably be willing to poll the IO driver in the block_on call from Sensei, even if it did, it could still decide to run a different future which ends up blocking on the mutex already held, and we're back to a deadlock.
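To make the hazard concrete, here is a minimal sketch using plain OS threads instead of tokio, so it is only an analogue of the executor problem described above. The function name and the channel-based stand-in for the database write are assumptions made up for illustration. It reports whether a second caller could take the state lock while the first caller is still waiting on storage.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Returns whether a second caller could take the state lock while the first
/// caller was still waiting on its (simulated) database write.
pub fn lock_free_during_write(hold_lock_during_io: bool) -> bool {
    let channel_state = Arc::new(Mutex::new(0u64));
    let (write_started_tx, write_started_rx) = mpsc::channel::<()>();
    let (write_done_tx, write_done_rx) = mpsc::channel::<()>();

    let state = Arc::clone(&channel_state);
    let writer = thread::spawn(move || {
        let mut guard = state.lock().unwrap();
        *guard += 1; // the state change that needs persisting
        if !hold_lock_during_io {
            drop(guard); // release the lock before waiting on storage
        }
        write_started_tx.send(()).unwrap();
        // Stand-in for blocking on the database future.
        write_done_rx.recv().unwrap();
    });

    write_started_rx.recv().unwrap();
    // Another task now wants the same lock.
    let free = channel_state.try_lock().is_ok();
    write_done_tx.send(()).unwrap();
    writer.join().unwrap();
    free
}
```

With `hold_lock_during_io` set to `true` the second caller finds the lock taken for the whole duration of the write; with `false` it gets through. In an async runtime the first case is worse than mere contention, because the task waiting on the mutex may occupy the worker that the write needs in order to make progress.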

Really if you're storing a monitor update via some async future you should use the TemporaryFailure return to do the monitor update asynchronously instead of blocking on IO with locks held, but we don't have good utilities for doing so today.

We need to (a) create an AsyncKVStorePersister (it either has to be runtime-specific, because it needs the ability to spawn background tasks, or we could make the user provide a spawn method which we can call), and (b) adapt the background processor to use it for async manager/graph/score persistence.
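A rough sketch of the "user provides a spawn method" option, using only the standard library. Everything here is an assumption for illustration: `AsyncKVStorePersister` is only named in this issue and its real shape is undecided, and `PersistSpawner`, `MemoryStore`, `ThreadSpawner` and `persist_in_background` are hypothetical names, not LDK APIs.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::io;
use std::pin::Pin;
use std::sync::{mpsc, Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

// Every name below is hypothetical; none of this is an existing LDK API.
pub type PersistFuture = Pin<Box<dyn Future<Output = io::Result<()>> + Send>>;
pub type DoneCallback = Box<dyn FnOnce(io::Result<()>) + Send>;
pub type SharedMap = Arc<Mutex<HashMap<String, Vec<u8>>>>;

/// Async counterpart to a key-value persister: returns a future instead of blocking.
pub trait AsyncKVStorePersister {
    fn persist(&self, key: String, value: Vec<u8>) -> PersistFuture;
}

/// User-provided spawn hook, so the library does not depend on a specific runtime.
pub trait PersistSpawner {
    fn spawn(&self, fut: PersistFuture, on_done: DoneCallback);
}

/// Toy in-memory store standing in for a (possibly remote) database.
#[derive(Clone, Default)]
pub struct MemoryStore(pub SharedMap);

/// Future that performs the write when it is first polled.
struct WriteFuture {
    map: SharedMap,
    entry: Option<(String, Vec<u8>)>,
}

impl Future for WriteFuture {
    type Output = io::Result<()>;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.get_mut();
        if let Some((key, value)) = this.entry.take() {
            this.map.lock().unwrap().insert(key, value);
        }
        Poll::Ready(Ok(()))
    }
}

impl AsyncKVStorePersister for MemoryStore {
    fn persist(&self, key: String, value: Vec<u8>) -> PersistFuture {
        Box::pin(WriteFuture { map: Arc::clone(&self.0), entry: Some((key, value)) })
    }
}

/// Wakes a parked thread; used by the minimal executor below.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal executor: polls on the current thread and parks until woken.
fn block_on(mut fut: PersistFuture) -> io::Result<()> {
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(res) => return res,
            Poll::Pending => thread::park(),
        }
    }
}

/// Spawner backed by plain OS threads; a tokio user would call its own spawn here.
pub struct ThreadSpawner;

impl PersistSpawner for ThreadSpawner {
    fn spawn(&self, fut: PersistFuture, on_done: DoneCallback) {
        thread::spawn(move || on_done(block_on(fut)));
    }
}

/// Starts a write and returns at once; the receiver yields the result later.
pub fn persist_in_background<P: AsyncKVStorePersister, S: PersistSpawner>(
    persister: &P,
    spawner: &S,
    key: &str,
    value: Vec<u8>,
) -> mpsc::Receiver<io::Result<()>> {
    let (tx, rx) = mpsc::channel();
    let on_done: DoneCallback = Box::new(move |res: io::Result<()>| {
        let _ = tx.send(res);
    });
    spawner.spawn(persister.persist(key.to_owned(), value), on_done);
    rx
}
```

The caller of `persist_in_background` never blocks with locks held: it gets a receiver back immediately and can treat the write as in flight until a result arrives. Passing the spawner as a trait keeps the persister independent of any particular runtime, at the cost of one more thing for the user to implement.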

TheBlueMatt commented 2 years ago

Oops missed that this is really just a specific instance of #1367, though one that we can fix easily.

tnull commented 2 years ago

I'd be interested in picking this up soon to get some exposure to the Tokio/Persist side of things.

TheBlueMatt commented 2 years ago

I think this is going to snowball into about 3-4 PRs and a large cleanup of the ChannelMonitorUpdateErr stuff, so I'd prefer to tackle this one if that's okay. There are a few places where ChannelMonitorUpdateErr::TemporaryFailure requires a persistence before returning, which won't work with a fully-async persister, so that'll need updating first.

tnull commented 2 years ago

I think this is going to snowball into about 3-4 PRs and a large cleanup of the ChannelMonitorUpdateErr stuff, so I'd prefer to tackle this one if that's okay.

Alright, go ahead!

G8XSU commented 2 years ago

Doubt: I am assuming the async KVStore will return a future. So basically we move out of blocking on persistence, but we don't proceed through the critical section of any channel state change (open/send/anything else) until the future is completed, right?

TheBlueMatt commented 2 years ago

Yep, basically. We actually already support this, but it's a bit unsafe: monitor update functions can return Err(ChannelMonitorUpdateErr::TemporaryFailure), which puts a channel on ice and doesn't transmit channel state updates until the monitor update completes. The intent here is to use that feature to implement the future, but we need to first make that feature a bit more robust.
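As a toy model of the "on ice" behaviour described above (not LDK's actual ChannelManager logic; `ToyUpdateResult`, `ToyChannel` and their methods are invented for illustration), a channel can hold back its outbound messages while any monitor update is still pending and release them once the last one completes:

```rust
/// Hypothetical stand-in for the result of a monitor update.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToyUpdateResult {
    Completed,
    TemporaryFailure, // the update is still being persisted in the background
}

/// Toy channel that withholds messages while monitor updates are in flight.
#[derive(Default)]
pub struct ToyChannel {
    pending_updates: usize,
    held: Vec<String>,
}

impl ToyChannel {
    /// Records a state change whose monitor update returned `result`.
    /// Returns the messages that may be sent to the peer right now.
    pub fn state_change(&mut self, message: &str, result: ToyUpdateResult) -> Vec<String> {
        self.held.push(message.to_owned());
        if result == ToyUpdateResult::TemporaryFailure {
            self.pending_updates += 1;
        }
        self.release_if_ready()
    }

    /// Called when one background monitor update has finished persisting.
    pub fn monitor_update_completed(&mut self) -> Vec<String> {
        self.pending_updates = self.pending_updates.saturating_sub(1);
        self.release_if_ready()
    }

    fn release_if_ready(&mut self) -> Vec<String> {
        if self.pending_updates == 0 {
            std::mem::take(&mut self.held)
        } else {
            Vec::new()
        }
    }
}
```

In this model the state change itself is applied without waiting, but nothing that depends on it leaves the node until persistence has caught up, which is the property the question above is asking about.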

TheBlueMatt commented 2 years ago

Once we do this, we should also update the docs on the updatestatus's in-progress state variant. https://github.com/lightningdevkit/rust-lightning/pull/1106#discussion_r950579411

TheBlueMatt commented 2 years ago

This has turned into a very large project :(. #1678 is the first step, but this is basically the 0.1 milestone, at this point, so I'm just gonna tag it as such.