lightningdevkit / rust-lightning

A highly modular Bitcoin Lightning library written in Rust. It's rust-lightning, not Rusty's Lightning!
Other
1.15k stars 364 forks source link

A way to block concurrent connections to same node #2268

Open TonyGiorgio opened 1 year ago

TonyGiorgio commented 1 year ago

It's a bit difficult to judge whether or not LDK should attempt to connect to another node if it is not currently connected. Calling new_outbound_connection and then get_peer_node_ids below won't inform you that you're connected or that you're in the process of getting connected to a node, because it is being filtered out.

If you don't have a long enough sleep before looping through the reconnection logic, or if the reconnection logic is disconnected from the connect logic (by way of checking stored peers), then this becomes a slight issue.

LDK allows new_outbound_connection to be called twice in a row which will cause a Peer sent invalid data error on the second one after it tries to init both of the connections.

There could be a couple approaches to help, such as:

https://github.com/lightningdevkit/rust-lightning/blob/bb38ed3b2e6fb3664d93ade1ad11cc5a96d1164f/lightning/src/ln/peer_handler.rs#L825-L844

TheBlueMatt commented 1 year ago

LDK allows new_outbound_connection to be called twice in a row which will cause a Peer sent invalid data error on the second one after it tries to init both of the connections.

It is worth pointing out that this is ~harmless. LDK will keep the older connection and drop the new one, and things will continue just fine.

Return a boolean whether or not handshake is completed

Not sure I understand this one - return a bool from what? The handshake is done async.

Return all peers regardless of handshaking being completed (or have it in a different API call)

The issue is we may not know that there's an inbound connection with a given peer until we're done with the handshake, so I've always been a bit skeptical of exposing the pre-handshake outbound keys in the "connected keys" list (also because it'd be confusing - we'd have channels that are not-usable due to peer disconnection but the peer is connected).

LDK returns error on new_outbound_connection when a 2nd call to it has been made while handshake is still in progress

We could do this, yea, now that we ensure we time out connections that fail to handshake for a while. Probably worth doing.

May also be worth pointing out that at least lightning-net-tokio returns a future-future - it returns a future which completes when the connection is live (ie tcp handshake, not the lightning handshake), and then that returns a future which completes when the connection disconnects, ie when its time to reconnect.

TonyGiorgio commented 1 year ago

It is worth pointing out that this is ~harmless. LDK will keep the older connection and drop the new one, and things will continue just fine.

Okay that's reassuring to know. I kept seeing an error in our logs but things seemed okay so it's good to know it can be ignored safely.

Not sure I understand this one - return a bool from what? The handshake is done async.

It's ugly, but was just throwing out all ideas, I meant like this:

pub fn get_peer_node_ids(&self) -> Vec<(PublicKey, Option<NetAddress>, bool)> { 

May also be worth pointing out that at least lightning-net-tokio returns a future-future - it returns a future which completes when the connection is live (ie tcp handshake, not the lightning handshake), and then that returns a future which completes when the connection disconnects, ie when its time to reconnect.

That's interesting. Yeah we aren't using that, but could very well be something we can try to do on our side in some way. We could await in a non-blocking loop that checks if our connect_peer logic has finished handshaking with a peer, resulted in a disconnection, or timed out. I believe we could probably do that without any changes to the LDK APIs.

I initially thought it would be too difficult to do something like this without any changes to the LDK APIs but it sounds like this is harmless and there is a workaround (and it's not a problem for tokio environments) so feel free to close this unless you prefer that something changes with LDK.

TheBlueMatt commented 1 year ago

I think refusing a second connection if we already have one is a sensible change still, even if not high priority.