lightningdevkit / rust-lightning

A highly modular Bitcoin Lightning library written in Rust. It's rust-lightning, not Rusty's Lightning!
Other
1.17k stars 365 forks source link

Drop reference to `Self` in `GossipVerifier::new` #3369

Open tnull opened 1 month ago

tnull commented 1 month ago

In GossipVerifier::new we currently force an argument of type signature of gossiper: Arc<P2PGossipSync<Arc<NetworkGraph<L>>, Self, L>>, which is not only terrible in it's own right, but also enforces the type signature of everything touching P2PGossipSync at compile time.

In an LDK Node context this would mean bubbling up generics all the way to Node, which we cannot and won't do, and it heavily conflicts with the ability to choose the GossipSync (P2P vs RGS) at runtime. As a consequence, this prohibits LDK Node (and I assume other users too) from using the GossipVerifier.

We really need to drop this circular type dependency, which means in turn dropping the U: UtxoLookup generic from P2PGossipSync and everything touching it and replace it with dynamic dispatching it at runtime, i.e., Arc<dyn UtxoLookup + Send + Sync> (although we possibly might get away with dropping the additional bounds here, we'll see).

TheBlueMatt commented 1 month ago

but also enforces the type signature of everything touching P2PGossipSync at compile time.

I mean this is generally true of Rust? If you have non-dynamic dispatch you have to specify the types of the things you're calling.

We really need to drop this circular type dependency,

I'm not really sure how, sadly. We do need things to be somewhat circular here, and short of intermediating responses through an event interface driven by the BP (or PeerManager) I'm not really sure how to do that.

replace it with dynamic dispatching it at runtime

There's no need to do that in LDK itself, any downstream application that doesn't want to deal with this can just write the types out as dynamic dispatch and they should be fine?

tnull commented 1 month ago

I mean this is generally true of Rust? If you have non-dynamic dispatch you have to specify the types of the things you're calling.

Right, but it makes these (circular) types really a pain to work with.

I'm not really sure how, sadly. We do need things to be somewhat circular here, and short of intermediating responses through an event interface driven by the BP (or PeerManager) I'm not really sure how to do that.

Right, the add_utxo_lookup method is an unfortunate necessity already, but we could at least drop the circular dependency in the generics.

There's no need to do that in LDK itself, any downstream application that doesn't want to deal with this can just write the types out as dynamic dispatch and they should be fine?

Mhh, I kind of disagree, at least in the LDK Node setting where we need to be able to switch between a P2PGossipSync and RapidGossipSync variants of lightning-background-processor::GossipSync at runtime, while also allowing some None value depending on the chain source (i.e., we only want to verify gossip for bitcoind). When I leave the type signature dyn UtxoLookup + Send + Sync, the compiler will shout at me at the point where I actually try to construct the GossipVerifier and hand it the concrete P2PGossipSync type:

72  |                         Arc::clone(gossip_sync),
    |                         ---------- ^^^^^^^^^^^ expected `GossipVerifier<RuntimeSpawner, ..., ...>`, found `Arc<dyn UtxoLookup + Send + Sync>`
    |                         |
    |                         arguments to this function are incorrect
    |
    = note: expected reference `&Arc<lightning::routing::gossip::P2PGossipSync<Arc<lightning::routing::gossip::NetworkGraph<_>>, GossipVerifier<RuntimeSpawner, Arc<RpcClient>, _>, _>>`
               found reference `&Arc<P2PGossipSync<Arc<NetworkGraph<Arc<FilesystemLogger>>>, Arc<dyn UtxoLookup + Send + Sync>, Arc<FilesystemLogger>>>`

After banging my head against the typechecker for a few hours and running into more issues of similar nature I concluded we should just drop the generic, which should make it much easier to deal with (or even possible) .

TheBlueMatt commented 1 month ago

When I leave the type signature dyn UtxoLookup + Send + Sync, the compiler will shout at me at the point where I actually try to construct the GossipVerifier and hand it the concrete P2PGossipSync type:

Can you provide a reproducer? That shouldn't be the case, AFAIU (though you may have to do an explicit cast when building the Arc).

tnull commented 1 month ago

Can you provide a reproducer? That shouldn't be the case, AFAIU (though you may have to do an explicit cast when building the Arc).

Sure. I did multiple attempts from different angles, but pushed the most-recent attempt in the last commit here: https://github.com/tnull/ldk-node/tree/2024-10-add-bitcoind-support-gossipverify

tnull commented 3 weeks ago

@TheBlueMatt Did you get a chance to look at the reproducer? Somewhat independently from that, would you be fine with going ahead with the approach outlined above? @vincenzopalazzo had previously indicated he'd be interested in picking this up for the next release.

TheBlueMatt commented 3 weeks ago

With this diff upstream:

diff --git a/lightning-block-sync/src/gossip.rs b/lightning-block-sync/src/gossip.rs
index f075cf7b2..083156baa 100644
--- a/lightning-block-sync/src/gossip.rs
+++ b/lightning-block-sync/src/gossip.rs
@@ -144,7 +144,7 @@ pub struct GossipVerifier<
 {
        source: Blocks,
        peer_manager_wake: Arc<dyn Fn() + Send + Sync>,
-       gossiper: Arc<P2PGossipSync<Arc<NetworkGraph<L>>, Self, L>>,
+       gossiper: Arc<P2PGossipSync<Arc<NetworkGraph<L>>, Arc<Self>, L>>,
        spawn: S,
        block_cache: Arc<Mutex<VecDeque<(u32, Block)>>>,
 }
@@ -162,7 +162,7 @@ where
        /// This is expected to be given to a [`P2PGossipSync`] (initially constructed with `None` for
        /// the UTXO lookup) via [`P2PGossipSync::add_utxo_lookup`].
        pub fn new<APM: Deref + Send + Sync + Clone + 'static>(
-               source: Blocks, spawn: S, gossiper: Arc<P2PGossipSync<Arc<NetworkGraph<L>>, Self, L>>,
+               source: Blocks, spawn: S, gossiper: Arc<P2PGossipSync<Arc<NetworkGraph<L>>, Arc<Self>, L>>,
                peer_manager: APM,
        ) -> Self
        where

this diff on top of your branch works:

diff --git a/src/types.rs b/src/types.rs
index 9fae37e..b021b30 100644
--- a/src/types.rs
+++ b/src/types.rs
@@ -89,7 +89,7 @@ pub(crate) type Scorer = ProbabilisticScorer<Arc<Graph>, Arc<FilesystemLogger>>;

 pub(crate) type Graph = gossip::NetworkGraph<Arc<FilesystemLogger>>;

-pub(crate) type UtxoLookup = dyn lightning::routing::utxo::UtxoLookup + Send + Sync;
+pub(crate) type UtxoLookup = lightning_block_sync::gossip::GossipVerifier<crate::gossip::RuntimeSpawner, Arc<lightning_block_sync::rpc::RpcClient>, Arc<FilesystemLogger>>;

 pub(crate) type P2PGossipSync =
        lightning::routing::gossip::P2PGossipSync<Arc<Graph>, Arc<UtxoLookup>, Arc<FilesystemLogger>>;
tnull commented 3 weeks ago

With this diff upstream: .. this diff on top of your branch works: ..

Nice, thanks for looking into it, that's a great first step!

However note that this would still require us to concretize the UtxoLookup type at compile time. While this would be fine right now where we our only gossip verifying chain source is the bitcoind RPC backend, we might need to keep this dynamic so that we can change it at runtime depending on the configured chain source, e.g., if we'd also want to add a REST variant, or possibly for BIP157/158, although the latter might be too inefficient anyways?

I guess at that point we could write a concrete adaptor implementing all the necessary traits and switching it internally, but given that LDK Node is likely not the only user struggling with this, why not just go the additional mile, simplify the generics, and use dynamic dispatch instead?

TheBlueMatt commented 1 week ago

Honestly maybe we should just bite the bullet and move the outbound message queue into NetworkGraph from P2PGossipSync and we can remove all the passing around of a reference to the message sender. Its pretty bad code layout but it would remove all the type insanity here.

tnull commented 1 week ago

Honestly maybe we should just bite the bullet and move the outbound message queue into NetworkGraph from P2PGossipSync and we can remove all the passing around of a reference to the message sender.

Ugh, that would be highly confusing tbh. If we need to do such a move, maybe it should be to a new QueueManager or whatever object that all can reference?

That said, maybe I should open a PR with the Arc<Self> diff to ensure we ship something that allows LDK Node to verify gossip after the next LDK release, as I imagine this refactor could turn out to be a larger endeavor?

TheBlueMatt commented 1 week ago

I'm not quite sure why that would be confusing? It wouldn't change the public API at all (aside from removing the need for the reference back to the P2PGossip entirely).

tnull commented 1 week ago

I'm not quite sure why that would be confusing? It wouldn't change the public API at all (aside from removing the need for the reference back to the P2PGossip entirely).

To me NetworkGraph is our data model for the graph, so storing internal message queues in there seems unrelated? I think we moved away from the network graph being just a data model in the node counters PR, but storing message queues seems like an entirely different affair altogether?

TheBlueMatt commented 1 week ago

To me NetworkGraph is our data model for the graph, so storing internal message queues in there seems unrelated? ...but storing message queues seems like an entirely different affair altogether?

Yes, indeed, its pretty gross for that reason, but at least its not publicly no longer "just a data model". Its not ideal but in terms of cleaning up the public API its a huge win...

I think we moved away from the network graph being just a data model in the node counters PR

Huh? The counters are just a part of the data model. Which is a weird data model, but its just a part of the data model.