johncantrell97 closed this pull request 2 years ago
Have introduced a global/shared "routing" peer manager. Its only purpose is to connect to peers to build and maintain the network graph. I have also centralized and refactored the peer connection and reconnection logic. The next step is to make sure the routing peer manager connects to some subset of the node's peers. It's unclear what the best logic for picking the subset is. To start it will probably be something like: ensure at least min(N, num(peers)) peers are connected, where N is around 10. It probably also makes sense to allow config/API control over the routing manager's peers.
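The subset rule above can be sketched in a few lines. This is illustrative only; the constant name and function are assumptions, not actual sensei identifiers:

```rust
/// The "N" discussed above: target number of routing-manager connections.
/// Name and value are assumptions for this sketch.
const ROUTING_PEER_TARGET: usize = 10;

/// Pick which of a node's peers the shared routing peer manager should
/// connect to: ensure at least min(N, num(peers)) are connected.
fn pick_routing_peers(all_peers: &[String]) -> Vec<String> {
    let target = ROUTING_PEER_TARGET.min(all_peers.len());
    // Naive selection: take the first `target` peers. A real implementation
    // might prefer long-lived or well-connected peers, or honor the
    // config/API overrides suggested above.
    all_peers.iter().take(target).cloned().collect()
}
```
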
This also means there is no longer any distinction between any of the nodes. NONE of the nodes handle gossip messages or maintain the graph. They are all identical. This means we are even closer to removing the distinction altogether. The two pieces that are left are the port (currently the 'root' node gets to declare a fixed port instead of taking from the range) and the web admin. Currently the web admin login information is still tied to the credentials for the 'root' node. Need to think of the best way to separate it.
Ok, this now supports a proper remote router and scorer via configuration options. It surfaced an upstream bug in LDK with ChannelDetails serialization, so this PR is likely blocked until LDK 110.
One other outstanding issue with this PR is that we currently rely on the NetworkGraph for reconnecting to channel peers. With a remote router our network graph is empty, so we can't look up the announcement information needed to reconnect to our peers.
The two options to fix this are to add some kind of RemoteNetworkGraph with endpoints to support the query we need (address lookup by pubkey), OR to store connection information about our peers ourselves. I'm leaning towards storing the connection information, because the network graph solution doesn't work for private nodes anyway (there are no announcements to look up in the graph).
The problem with a stored directory is that it can go stale if a peer announces new connection information while we're using a remote router. The best solution is probably to do both.
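The "do both" approach could look something like this sketch: prefer freshly announced info when available, otherwise fall back to the stored directory (which also covers private nodes with no announcements). All names here are illustrative, not sensei APIs:

```rust
use std::collections::HashMap;

/// Hypothetical local directory of last-known peer addresses.
struct PeerConnectionStore {
    // node pubkey (hex) -> last known "host:port"
    by_pubkey: HashMap<String, String>,
}

impl PeerConnectionStore {
    /// `announced` is address info obtained from a graph/announcement
    /// lookup (remote or local), if any. Announced data is freshest, so
    /// prefer it; the stored entry is the fallback. This handles both the
    /// stale-directory case and private nodes with no announcement.
    fn lookup(&self, pubkey: &str, announced: Option<&str>) -> Option<String> {
        announced
            .map(str::to_owned)
            .or_else(|| self.by_pubkey.get(pubkey).cloned())
    }
}
```
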
This PR does the following:
Exposes new endpoints for finding a route and logging successful/failed payment attempts
Implements a RemoteScorer and RemoteRouter that use a remote sensei instance for routing and bubble payment success/failure back to the remote instance so a global scorer can be updated.
Refactors all of the P2P pieces (network graph, router, scorer, and network graph message handler) to be truly global to a sensei instance, instead of each node having its own that ends up only being maintained (and later shared) by a "root" node. This means that on sensei startup there is now a network graph, router, scorer, and P2P gossip handler available without any nodes running.
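The RemoteRouter/RemoteScorer shape described above can be sketched roughly as follows. The trait, method signatures, and endpoint path are assumptions for illustration, not the actual PR code or sensei's HTTP API:

```rust
/// Assumed abstraction over local vs remote routing backends.
trait RoutingBackend {
    fn find_route(&self, payer: &str, payee: &str, amt_msat: u64) -> Result<Vec<String>, String>;
    /// Bubbles payment success/failure back so a global scorer can learn
    /// from every node's payment attempts.
    fn report_payment(&self, path: &[String], success: bool);
}

/// Delegates route finding and scorer updates to a remote sensei instance.
struct RemoteRouter {
    base_url: String,
}

impl RoutingBackend for RemoteRouter {
    fn find_route(&self, payer: &str, payee: &str, amt_msat: u64) -> Result<Vec<String>, String> {
        // A real implementation would call the remote instance's
        // route-finding endpoint (path assumed for this sketch):
        let _request = format!(
            "{}/route?payer={}&payee={}&amt={}",
            self.base_url, payer, payee, amt_msat
        );
        // Network call elided in this sketch.
        Err("sketch only: no network call performed".to_string())
    }

    fn report_payment(&self, _path: &[String], _success: bool) {
        // Would POST the payment result so the remote scorer updates.
    }
}
```
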
This means we still need to designate a node to receive the P2P gossip. For now this is still the "root" node, but we are closer than ever to being able to remove this designation. In theory this could be assigned to any running node on the instance. It's not clear to me, though, whether the gossip handler can be re-assigned to another running node if the node currently handling gossip goes offline. Will ask in the LDK discord shortly.
What's left is to add a configuration flag and a way to choose between local and remote routing/scoring.
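One way that configuration choice could be modeled (the flag semantics and names here are guesses at how it might look, not the final design):

```rust
/// Assumed configuration for choosing the routing/scoring backend.
#[derive(Debug, PartialEq)]
enum RoutingMode {
    /// Run gossip, graph, and scorer in-process (current behavior).
    Local,
    /// Use RemoteRouter/RemoteScorer against another sensei instance.
    Remote { url: String },
}

/// Parse a hypothetical config value: empty or "local" means local
/// routing, anything else is treated as a remote instance URL.
fn parse_routing_mode(value: &str) -> RoutingMode {
    match value {
        "" | "local" => RoutingMode::Local,
        url => RoutingMode::Remote { url: url.to_string() },
    }
}
```
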