Look Into using -log((channel capacity + 1 - HTLC amount ) / (channel capacity +1)) for scoring

TheBlueMatt commented 2 years ago

Rene suggests it at https://github.com/rust-bitcoin/rust-lightning/pull/1166#issuecomment-970314672 noting that his research seems to indicate channel failure probability is strongly correlated with the above log. We will want to do benchmarks here to show its not materially slower.

renepickhardt commented 2 years ago

I think there has been a miss understanding. While log(HTLC amount/channel value) might work better than your current scoring function I would advice against using it.

Let me recap to resolve the misunderstanding:

The failure probability of a channel is: HTLC amount / (channel capacity +1)
The success probability is s_c(HTLC amount) := (channel capacity + 1 - HTLC amount ) / (channel capacity +1)
The success probability of an entire path of n channel c_1,...,c_n is defined as the product of the channel success probabilities p(c_1,...,c_n)=s_c_1(HTLC amount)*...*s_c_n(HTLC amount) (Technically you can set the first probability to 1 as you know your local balance and have full certainty. Similarly the last hop if you were given a route hint and you trust the recipient)

Now the goal is to find the most probable path. However the problem of finding a path where the product of the edge weights is to be maximized is a strange problem.

However because the log function is a group homomorphism from the multiplicative group of real numbers to the the additive group we can reformulate the problem as a problem to find the shortest path where the edge have the weight -log(s_c(HTCL amount)). Note that the log allows us to have the sum of weights (instead of the product multiplication of the probability) as in regular path finding algorithms and the Minus sign turns the maximization problem (of the probability) to a minimization problem (aka shortest path).

Thus the path minimizing -log(s_c_1(HTCL amount)) - .... -log(s_c_n(HTCL amount)) is the same that maximizes the product which corresponds to the path with the maximal probability that we are searching.

That all being said let's look at the misunderstanding.

The proper scoring function s_c(HTLC amount)= -log((channel capacity + 1 - HTLC amount ) / (channel capacity +1)) is not linear and rather convex. For being used in Dijkstra this does not matter as the runtime of dijsktra does not depend on the structure of the cost function. However if you want to solve a minimum cost flow the solvers become much fast if you were to have a linear function.

Thus what I suggested is that you could linearize s_c(HTLC amount) by looking at the linear term of the Taylor Series. With the help of Wolfram Alpha we see

s_c(x) = x/(c+1) + x^2/(2(c+1)^2) + x^3/(3*(c+1)^3) + O(x^4)

Thus the suggestion was that one can use the linearized version (HTLC amount)/(channel capacity + 1) instead of s_c(HTLC amount)= -log((channel capacity + 1 - HTLC amount ) / (channel capacity +1)). The remarkable observation is that the linearized version of the negative log probability corresponds to the failure probability (see above) which gives us a good intuition why this is a nice score / bias.

However using the linearized score has sever disadvantages too as it tends to saturate channels (similarly aas the current fee function tends to saturate channel) This becomes only relevant when one does a flow computation and doesn't use ad-hoc splitting before generating candidate paths. However flow computation is suggested as it will boost the reliability similarly to going to the probabilistic model.

Also using the linearized version is tricky when you want to use the conditional probabilities assuming you already have reduced uncertaintay as explained in detail in your other issue at https://github.com/lightningdevkit/rust-lightning/issues/1170#issuecomment-972396747 where I provide a more extensive elaboration on the rest of the theory and how to utilize knowledge from prior attempts

renepickhardt commented 2 years ago

btw one side note: the c-lightning team did actually try HTLC amount/channel capacity before they switched to the log probabilities. They did not mix the linearized version properly with the other features so the experimental results are not really comparable but the negative log probabilities worked better. you can find more at https://github.com/ElementsProject/lightning/pull/4771 and the evaluation on https://medium.com/@cdecker/d1cbb66f0699 also I have released a slide deck today explaining the basics behind the results. There is a video recording that should be sent to me soon and I am happy to share it here but I guess the comments in this and the other issue will be more useful.

TheBlueMatt commented 2 years ago

Ah, thanks for clearing that up, indeed, I didn't dig into the full paper and the comment on 1170 is rather opaque. Note that your above comment has a -1 in the numerator vs the c-lightning implementation uses a +1.

renepickhardt commented 2 years ago

Ah, thanks for clearing that up, indeed, I didn't dig into the full paper and the comment on 1170 is rather opaque.

Sorry for that. I actually sat down and took the time to summarize the relevant parts of the two papers for your given setting as the papers are even more opaque / abstract. There Ist a tl;dr with concrete todos in the summary that I provided to the c-lightning issue at: https://github.com/ElementsProject/lightning/issues/4920 but I thought you wanted to also see the reasoning behind the suggested solution.

Note that your above comment has a -1 in the numerator vs the c-lightning implementation uses a +1.

That is a typo here as it was late when I replied to you! +1 is correct. Good catch will fix this in a second.

jkczyz commented 2 years ago

Our Score trait is defined such that implementations return a penalty in msats to route an HTLC amount through a hop, which is added to the fees charged by the hop to obtain the cost. To compute such a penalty, is it simply enough to multiple -log(s_c(HTLC amount)) by the HTLC amount?

lightningdevkit / rust-lightning

Look Into using -log((channel capacity + 1 - HTLC amount ) / (channel capacity +1)) for scoring #1172