ZmnSCPxj / clboss

Automated Core Lightning Node Manager
MIT License

ChannelCandidateInvestigator should store more information #83

Open btweenthebars opened 3 years ago

btweenthebars commented 3 years ago

Based on discussion here

clboss should record why it proposed making a channel to a node, such as distance, earned fee, or listpays. It should also keep track of how often a node is found by EarnedFee, ListPays, or other metrics.

Not sure if now it only stores 30 nodes, but there shouldn't be a limit. When clboss has funds and is ready to make channels, it should calculate a score from all of this information to pick the top nodes.

ZmnSCPxj commented 3 years ago

Not sure if now it only stores 30 nodes, but there shouldn't be a limit.

A thing I have been contemplating actually would be to periodically remove older entries. The logic here is that a recommendation made long in the past may have gotten stale in the meantime, so it is better to just remove older entries in favor of newer ones.

So, rather than keeping track of how often it has been recommended by some ChannelFinder* etc. module, instead a re-recommendation would "refresh" the age of the candidate and reduce the chance it gets chosen during the above decimation.
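To make the refresh idea concrete, here is a minimal sketch in Python. It is not CLBOSS code: `CandidateTable`, `recommend`, `decimate`, and the hard age cutoff are all names and simplifications assumed for illustration. The comment above only says that a re-recommendation should reduce the chance of removal, so a real implementation might well use a randomized removal weighted by age instead of a fixed cutoff.

```python
class CandidateTable:
    """Hypothetical candidate store: tracks when each node was last recommended."""

    def __init__(self, max_age_seconds):
        self.max_age_seconds = max_age_seconds
        self.last_recommended = {}  # node_id -> timestamp of latest recommendation

    def recommend(self, node_id, now):
        # A new recommendation and a re-recommendation are handled the same way:
        # both simply reset the age of the candidate.
        self.last_recommended[node_id] = now

    def decimate(self, now):
        # Assumption: entries older than a fixed cutoff are dropped.
        stale = [
            node_id
            for node_id, seen in self.last_recommended.items()
            if now - seen > self.max_age_seconds
        ]
        for node_id in stale:
            del self.last_recommended[node_id]
        return stale


table = CandidateTable(max_age_seconds=100)
table.recommend("node_a", now=0)
table.recommend("node_b", now=0)
table.recommend("node_a", now=90)   # re-recommendation refreshes node_a
removed = table.decimate(now=150)   # node_b is stale, node_a survives
```

With this shape there is no counter of how often a node was recommended; a frequently recommended node just tends to stay young.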

Another important point is that since the algorithms in CLBOSS are publicly visible, it is much easier to arrange situations that would make the algorithms select particular nodes. Thus the channel finder algorithms actually have a fair amount of randomization involved --- Popularity does not select the most popular nodes, it uses a random lottery that is weighted on popularity (so it is more likely to select popular nodes, and in practice that is what it does for most nodes, but the occasional unpopular node gets in, too).

EarnedFee in particular is gameable --- create an artificial popular node (so that Popularity is likely to pick it up), with all the counterparties of the "popular" node actually under your control, then arrange constant rebalancings between your "popular" node and its counterparties, which increases the EarnedFee and makes it more likely that one of your faked counterparties gets selected by a CLBOSS node. That sort of thing.

So I think you are overestimating the determinism of the channel finders --- they all have randomized components, usually Stats::ReservoirSampler (Popularity does not, since it predates that object, but it uses the same algorithm (A-Chao) and I think I have some weird optimization thing there which prevents it being refactored to use that object). This makes keeping scores based on how often something gets recommended not quite as useful.
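For readers unfamiliar with a weighted lottery of this kind, the following Python sketch shows the single-item case of weighted reservoir sampling in the style of A-Chao. It is an illustration only and not the code of Stats::ReservoirSampler; `weighted_pick` and the example weights are assumed names and values.

```python
import random


def weighted_pick(weights, rng):
    """Pick one key with probability proportional to its weight, in one pass.

    Illustrative only: a reservoir of size 1, not the CLBOSS implementation.
    """
    chosen = None
    total = 0.0
    for node, weight in weights.items():
        if weight <= 0:
            continue  # zero-weight entries can never win the lottery
        total += weight
        # Replace the current pick with probability weight / running total.
        # The first positive-weight item is always taken (probability 1).
        if rng.random() < weight / total:
            chosen = node
    return chosen


rng = random.Random(1)
popularity = {"popular": 9, "obscure": 1}  # hypothetical popularity scores
picks = [weighted_pick(popularity, rng) for _ in range(1000)]
share_popular = picks.count("popular") / len(picks)
```

Over many draws `share_popular` should come out near 0.9, so the popular node usually wins but the obscure one still gets picked sometimes, which is why counting how often a node is recommended mostly re-measures its weight.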

Finally --- as mentioned, the main point of ChannelCandidateInvestigator is to investigate uptime of the node, which is the time-consuming part of node selection. It limits this to about eight attempts to connect to nodes per hour, because of course every connect attempt consumes bandwidth (both on TCP handshaking and on the LN protocol handshaking). This limit could be raised but not removed. So a limit on the number of candidates we keep track of is needed anyway; the question is whether 32 total candidates with an investigation rate of 8 per hour is good or should be raised.
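A quick back-of-the-envelope check of those numbers, as a Python sketch. It assumes the connect attempts are spread evenly over all candidates, which is not stated above; the function names are made up for the example.

```python
def hours_between_probes(num_candidates, probes_per_hour):
    # Assumption: attempts are shared evenly, so each candidate waits
    # num_candidates / probes_per_hour hours between uptime checks.
    return num_candidates / probes_per_hour


def probes_per_candidate_per_day(num_candidates, probes_per_hour):
    return 24 * probes_per_hour / num_candidates


current_gap = hours_between_probes(32, 8)               # 4.0 hours
current_daily = probes_per_candidate_per_day(32, 8)     # 6.0 checks per day
doubled_gap = hours_between_probes(64, 8)               # 8.0 hours
```

Under that assumption, doubling the table to 64 candidates without raising the rate halves the uptime samples per candidate, so the two limits have to be tuned together.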

btweenthebars commented 3 years ago

I was going through that table looking for nodes to open channels with. The first thing I considered was the median fee that each node set. Many on the list set pretty high fees, so they are unlikely to route much and would be costly to rebalance. Then I looked at their patrons to see if I should make channels with them instead, based on the fees and channel sizes toward my original target nodes. Some were still not suitable, so I looked further by running getroute (excluding some useless drained channels that I had opened by mistake) to the recommended nodes and their patrons, and getroute among themselves. Finally I found one node that set low fees and was less than 3 hops away from my targets. So I just opened a big channel to this one instead.

Of course this could lead to centralization, but it's very practical. If clboss cares about decentralization, it can try to find, for example, 3 of these nodes, to have good reachability and redundancy to all the nodes and patrons on the list. Then it can remove the nodes on the list or mark them as solved.

So when clboss has some funds and is ready to open N channels, should it do some getroutes and data crunching to find the N best candidates among the candidates, instead of a lottery?